Elixir AI
Mix.install([
{:kino, "~> 0.9.0"},
{:benchee, "~> 1.1"},
{:nx, "~> 0.5.2"},
{:axon, "~> 0.5.1"},
{:exla, "~> 0.5.2"}
])
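EXLA is installed above but never configured in this notebook. As a sketch of one option, you can make EXLA the default backend so the Nx and Axon computations below run on the compiled CPU/GPU backend rather than the pure-Elixir default:
# Optional sketch: use EXLA (XLA) as the default backend instead of Nx.BinaryBackend
Nx.global_default_backend(EXLA.Backend)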
Artificial Intelligence With Elixir
Elixir supports the numerical computations needed to build artificial intelligence applications. This is accomplished with the following libraries:
- Nx - a multi-dimensional tensor library with multi-staged compilation to the CPU/GPU
- Axon - a high-level interface for creating neural network models
Nx: Multi-dimensional Tensors and Numerical Expressions
The code cells in this section are taken from, or derived from, the Intro to Nx guide on HexDocs.
A simple tensor
t = Nx.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
Since tensors are used to perform multi-dimensional mathematics, they have a shape associated with them
t |> Nx.shape()
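Besides the shape, related introspection functions such as Nx.rank/1 and Nx.size/1 report the number of dimensions and the number of elements:
{Nx.rank(t), Nx.size(t)}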
An Nx.Tensor can be created from a List
t = 1..4 |> Enum.chunk_every(2) |> Nx.tensor(names: [:y, :x])
The above Nx.Tensor has named dimensions, so they can be accessed by name
%{"first column" => t[x: 0], "first row" => t[y: 0]}
Exercise:
- Create a {3,3} tensor with named dimensions
- Return a {2,2} tensor containing the first two columns of the first two rows
t =
1..9
|> Enum.chunk_every(3)
|> Nx.tensor(names: [:j, :i])
t = t[j: 0..1][i: 0..1]
Tensor Aware Functions
t
The Nx module has many functions that apply scalar operations element-wise to an Nx.Tensor
t |> Nx.cos()
You can also call functions that aggregate the contents of a tensor, for example to get the sum of the numbers in an Nx.Tensor
t |> Nx.sum()
# Sum of rows in tensor
t |> Nx.sum(axes: [:i])
Exercise:
- Create a {2, 2, 2} tensor
- With values 1..8
- With dimension names [:z, :y, :x]
- Calculate the sums along the :y axis
1..8
|> Enum.chunk_every(4)
|> Enum.map(&Enum.chunk_every(&1, 2))
|> Nx.tensor(names: [:z, :y, :x])
|> Nx.sum(axes: [:y])
Other matrix operations, such as subtraction, are available via the Nx module
a = Nx.tensor([[5, 6], [7, 8]])
b = Nx.tensor([[1, 2], [3, 4]])
a |> Nx.subtract(b)
Broadcasting
Nx.broadcast/2 takes a tensor (or a scalar) and a shape, translating it to a compatible shape by copying it
Nx.broadcast(1, {2, 2})
a = Nx.tensor([[1, 2], [3, 4]])
# Want to do [[1, 2], [3, 4]] - 1 (subtract 1 from every element in the LHS)
b = Nx.subtract(a, Nx.broadcast(1, {2, 2}))
b == Nx.subtract(a, 1)
# Here we pass a tensor to Nx.broadcast/2 and it will extract its shape to make a compatible operation
b == Nx.subtract(a, Nx.broadcast(1, a))
# Subtract row (or column) wise
# Want to do [[1, 2], [3, 4]] - [[1, 2]] === [[1, 2], [3, 4]] - [[1, 2], [1, 2]] === [[0, 0], [2, 2]]
a = Nx.tensor([[1, 2], [3, 4]])
b = Nx.tensor([[1, 2]])
c = a |> Nx.subtract(Nx.broadcast(b, {2, 2}))
# The subtraction function will take care of the broadcast implicitly
c2 = a |> Nx.subtract(b)
c == c2
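Broadcasting also handles the column-wise case: a {2, 1} tensor is expanded across the columns automatically. For example:
# Want to do [[1, 2], [3, 4]] - [[10], [20]] === [[1, 2], [3, 4]] - [[10, 10], [20, 20]] === [[-9, -8], [-17, -16]]
col = Nx.tensor([[10], [20]])
a |> Nx.subtract(col)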
Automatic Differentiation (autograd)
Gradients are critical for solving systems of equations and building probabilistic models. In calculus, derivatives are used to compute gradients. Nx can compute these derivatives automatically through a feature called automatic differentiation, or autograd.
defmodule Funs do
  import Nx.Defn

  # f(x) = 3x^2 + 2x + 1
  defn poly(x) do
    3 * Nx.pow(x, 2) + 2 * x + 1
  end

  # f'(x) = 6x + 2
  defn poly_slope_at(x) do
    grad(&poly/1).(x)
  end

  # f(x) = sin(x)
  defn sinwave(x) do
    Nx.sin(x)
  end

  # f'(x) = cos(x)
  defn sinwave_slope_at(x) do
    grad(&sinwave/1).(x)
  end
end
The function grad/1 takes a function and returns a function that computes the gradient
You can check if this value is correct by looking at the graph of 6x + 2
Funs.poly_slope_at(2)
You can check if this value is correct by looking at the graph of cos(x)
Funs.sinwave_slope_at(1)
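You can also take a gradient without defining a module by calling Nx.Defn.grad/1 directly, for example:
# Gradient of x^3 at x = 2.0 is 3 * 2.0^2 = 12.0
Nx.Defn.grad(fn x -> Nx.pow(x, 3) end).(Nx.tensor(2.0))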
Axon: High-Level Interface For Neural Network Models
Axon is a high-level interface for creating neural network models
Axon is built entirely on top of Nx numerical definitions, so every neural network can be JIT or AOT compiled using any Nx compiler, or even transformed into high-level neural network formats like TensorFlow Lite and ONNX.
First Model (Identity Model)
Everything in Axon centers around the %Axon{} struct, which represents an instance of an Axon model
Models are graphs which represent the transformation and flow of input data to a desired output. You can think of a model as representing a single computation or function
All Axon models start with a declaration of input nodes. These are the root nodes of your computation graph, and correspond to the actual input data you want to send to Axon:
model = Axon.input("data")
model, technically speaking, is now a valid Axon model which you can inspect, execute, and initialize
template = Nx.template({2, 8}, :f32)
model |> Axon.Display.as_graph(template)
The execution flow is just a single node, because the graph consists of a single input node. You pass data in and the model returns the same data, without any intermediate transformations
You can build the %Axon{} struct into its initialization and forward functions by calling Axon.build/2. This pattern of “lowering” or transforming the %Axon{} struct into other functions or representations is very common in Axon. By traversing the data structure you can create useful functions, execution visualizations, and more
{init_fn, predict_fn} = Axon.build(model)
init_fn returns all of your model’s trainable parameters and state. You need to pass a template and any initial parameters you want your model to start with (this is useful for things like transfer learning)
predict_fn returns the model’s output, computed from its trainable parameters and the given inputs
training_params = init_fn.(template, %{})
init_fn/2 returned %{} because model does not have any trainable parameters. This should make sense because it’s just an input layer
input = 1..8 |> Enum.chunk_every(4) |> Nx.tensor(type: :f32)
predict_fn.(training_params, input)
Passing the training_params and some input to predict_fn, the model can actually be executed, returning the given input as expected
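Because Axon models lower to Nx defn, the same model can also be built with an Nx compiler. A minimal sketch, assuming the EXLA dependency installed at the top of this notebook:
# Same model, but the returned functions are JIT compiled with EXLA
{exla_init_fn, exla_predict_fn} = Axon.build(model, compiler: EXLA)
exla_predict_fn.(exla_init_fn.(template, %{}), input)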
Sequential Models
Sequential models are named after the sequential nature in which data flows through them: they transform the input with a series of successive transformations
Creating a sequential model in Axon is the same as writing a pipeline of transformations in “regular” Elixir:
model =
Axon.input("data")
# layer with 32 outputs
|> Axon.dense(32)
# layer with an element wise operation of :relu
|> Axon.activation(:relu)
# layer to reduce overfitting, effective regularization method
|> Axon.dropout(rate: 0.5)
# layer with 1 output
|> Axon.dense(1)
# layer with an element wise operation of :softmax
|> Axon.activation(:softmax)
Visualizing the model, we can see how the data will flow through it
template = Nx.template({4, 8}, :f32)
Axon.Display.as_graph(model, template)
{init_fn, predict_fn} = Axon.build(model)
training_params = init_fn.(template, %{})
This model actually has trainable parameters. The parameter map is just a regular Elixir map. Each top-level entry maps to a layer, with a key corresponding to that layer’s name and a value corresponding to that layer’s trainable parameters. Each layer’s individual trainable parameters are given layer-specific names and map directly to Nx tensors
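One way to see this structure is to map the parameter map down to shapes. This is just an inspection sketch; the layer and parameter names are whatever Axon generated for this model:
# Replace each parameter tensor with its shape, keeping the layer/parameter names
Map.new(training_params, fn {layer_name, layer_params} ->
  {layer_name, Map.new(layer_params, fn {param_name, tensor} -> {param_name, Nx.shape(tensor)} end)}
end)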
input = Nx.iota({4, 8}, type: :f32)
predict_fn.(training_params, input)
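As mentioned earlier, the second argument to init_fn lets you start from existing parameters, which is the basis for transfer learning. A minimal sketch, reusing the parameters of one dense layer (the "dense_0" key is an assumption; check the keys of training_params for the names Axon actually generated):
# Hypothetical: seed initialization with previously trained parameters for one layer
pretrained = Map.take(training_params, ["dense_0"])
seeded_params = init_fn.(template, pretrained)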
Complex Models
Some models require a more flexible API. Since Axon models are just Elixir data structures, you can manipulate them and decompose architectures as you would any other Elixir program
input = Axon.input("data")
x1 = input |> Axon.dense(32)
x2 = input |> Axon.dense(64) |> Axon.relu() |> Axon.dense(32)
model = Axon.add(x1, x2)
Your model branches input into x1 and x2. Each branch performs a different set of transformations. At the end, the branches are merged with Axon.add/3. The layer created with Axon.add/3 is sometimes called a combinator: a layer that operates on multiple Axon models at once, typically to merge some branches together
model represents the final Axon model
Visualizing this model, you can see the fully built branching structure
template = Nx.template({2, 16}, :f32)
Axon.Display.as_graph(model, template)
{init_fn, predict_fn} = Axon.build(model)
training_params = init_fn.(template, %{})
As your model’s architecture grows in complexity, you might find yourself reaching for better abstractions to organize your model creation code. PyTorch models are often organized into nn.Module subclasses. If you’re translating models from PyTorch to Axon, it’s natural to create one Elixir function per nn.Module.
You should write your models as you would any other Elixir code:
defmodule ComplexModel do
  def create() do
    Axon.input("data")
    |> conv_block()
    |> Axon.flatten()
    |> dense_block()
    |> dense_block()
    |> Axon.dense(1)
  end

  # Convolution block with a residual (skip) connection followed by max pooling
  defp conv_block(input) do
    input
    |> Axon.conv(3, padding: :same)
    |> Axon.mish()
    |> Axon.add(input)
    |> Axon.max_pool(kernel_size: {2, 3})
  end

  # Fully-connected block: dense layer followed by a ReLU activation
  defp dense_block(input) do
    input
    |> Axon.dense(32)
    |> Axon.relu()
  end
end
model = ComplexModel.create()
template = Nx.template({1, 28, 28, 3}, :f32)
Axon.Display.as_graph(model, template)
Multi-Input & Multi-Output Models
Multi-Input Model
Sometimes your model needs multiple inputs
input1 = Axon.input("input1")
input2 = Axon.input("input2")
model = Axon.add(input1, input2)
You can inspect the inputs of your model in two ways: with Axon.Display.as_graph/2 and with Axon.get_inputs/1
Axon.get_inputs(model)
inputs = %{"input1" => Nx.template({2, 8}, :f32), "input2" => Nx.template({2, 8}, :f32)}
Axon.Display.as_graph(model, inputs)
{init_fn, predict_fn} = Axon.build(model)
training_params = init_fn.(inputs, %{})
input1 = Nx.iota({2, 8}, type: :f32)
input2 = Nx.iota({2, 8}, type: :f32)
inputs = %{"input1" => input1, "input2" => input2}
predict_fn.(training_params, inputs)
Multi-Output Models
You also might want to have multiple outputs from your model. Axon.container/2
can be used to wrap multiple nodes into any supported Nx
container:
data = Axon.input("data")
x1 = data |> Axon.dense(32) |> Axon.relu()
x2 = data |> Axon.dense(64) |> Axon.relu()
model = Axon.container({x1, x2})
template = Nx.template({2, 8}, :f32)
Axon.Display.as_graph(model, template)
{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(template, %{})
data = Nx.iota({2, 8}, type: :f32)
predict_fn.(params, data)
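Since Axon.container/2 accepts any supported Nx container, the outputs can also be named by wrapping the branches in a map (the key names here are arbitrary), for example:
# Same branches, but the outputs come back as a map with named keys
named_model = Axon.container(%{relu_32: x1, relu_64: x2})
{named_init_fn, named_predict_fn} = Axon.build(named_model)
named_params = named_init_fn.(template, %{})
named_predict_fn.(named_params, data)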
Creating Custom Layers
Axon has a number of built-in layers such as Axon.relu/1 and Axon.dense/2. As you develop more sophisticated models, you will likely need to develop custom layers
A layer in Axon is just a defn implementation with special Axon inputs. Every layer in Axon is implemented with the Axon.layer/3 function. The API of Axon.layer/3 intentionally mirrors Kernel.apply/2 to make developing custom layers as close to writing “normal” Elixir code as possible.
A defn layer implementation looks like any other def implementation, except it must always account for opts as its final parameter, because it will always receive a :mode option indicating whether the model is running in training or inference mode. This allows you to customize the behavior of the layer based on the execution mode.
If you plan on reusing custom layers in many locations, it’s recommended that you wrap them in Elixir functions as an interface. If you were to use defn instead of def for the interface function, you would receive an error about a LazyContainer being used incorrectly
defmodule CustomLayers do
  import Nx.Defn

  # Interface function: wraps Axon.layer/3 so the custom layer can be piped
  # like any built-in layer
  def my_layer(%Axon{} = input, opts \\ []) do
    opts = Keyword.validate!(opts, [:name])
    alpha = Axon.param("alpha", fn _ -> {} end)

    Axon.layer(&my_layer_impl/3, [input, alpha], name: opts[:name], op_name: :my_layer)
  end

  # Numerical implementation: sin(input) * alpha
  defnp my_layer_impl(input, alpha, _opts \\ []) do
    input
    |> Nx.sin()
    |> Nx.multiply(alpha)
  end
end
With the layer implementation defined, CustomLayers.my_layer/2 can be used like any other layer in a model
model =
Axon.input("data")
|> CustomLayers.my_layer()
|> CustomLayers.my_layer()
|> Axon.dense(1)
Axon.Display.as_graph(model, Nx.template({2, 8}, :f32))
{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(Nx.template({2, 8}, :f32), %{})
predict_fn.(params, Nx.iota({2, 8}, type: :f32))
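To act on the :mode option mentioned above, the implementation needs to branch on it. One possible sketch, using a hypothetical scaling layer and branching at trace time with deftransformp (the layer name, parameter, and branching approach here are assumptions, not part of the guide):
defmodule ModeAwareLayers do
  import Nx.Defn

  # Hypothetical layer: scales the input by a learned parameter in training mode,
  # and passes the input through unchanged at inference time
  def train_only_scale(%Axon{} = input, opts \\ []) do
    opts = Keyword.validate!(opts, [:name])
    gamma = Axon.param("gamma", fn _ -> {} end)

    Axon.layer(&train_only_scale_impl/3, [input, gamma],
      name: opts[:name],
      op_name: :train_only_scale
    )
  end

  # deftransformp runs as regular Elixir at trace time, so we can branch on the
  # :mode option that Axon always passes to layer implementations
  deftransformp train_only_scale_impl(input, gamma, opts) do
    case opts[:mode] do
      :train -> Nx.multiply(input, gamma)
      _ -> input
    end
  end
end
When such a model is built with mode: :train, the scaling is part of the graph; with the default :inference mode, the layer is a no-op.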
Model Hooks
Inspecting or visualizing the values of intermediate layers in your model during the forward or backward pass can be key to understanding its behavior (e.g., visualizing the gradients of activation functions to check that the model is learning in a stable manner). Axon supports this via model hooks.
Model hooks are a mechanism for unidirectional communication with an executing model: you can only receive information from your model, not send information to it.
Hooks attach per layer and can execute at 4 different points in model execution: initialization, pre-forward, forward pass, or backward pass
input = Nx.iota({2, 4}, type: :f32)
model =
Axon.input("data")
|> Axon.dense(8)
|> Axon.attach_hook(fn val -> IO.inspect(val, label: :dense_init) end, on: :initialize)
|> Axon.attach_hook(fn val -> IO.inspect(val, label: :dense_forward) end, on: :forward)
|> Axon.relu()
|> Axon.attach_hook(fn val -> IO.inspect(val, label: :relu) end, on: :forward)
{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(input, %{})
predict_fn.(params, input)
Hooks execute in the order they were attached to a layer. If you attach 2 hooks to the same layer on the same event, they will run in that order
model =
Axon.input("data")
|> Axon.dense(8)
|> Axon.attach_hook(fn val -> IO.inspect(val, label: :hook1) end, on: :forward)
|> Axon.attach_hook(fn val -> IO.inspect(val, label: :hook2) end, on: :forward)
|> Axon.relu()
{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(input, %{})
predict_fn.(params, input)
Hooks can also be configured to run on all events
model =
Axon.input("data")
|> Axon.dense(8)
|> Axon.attach_hook(&IO.inspect/1, on: :all)
|> Axon.relu()
|> Axon.dense(1)
{init_fn, predict_fn} = Axon.build(model)
Calling init_fn triggers the initialization hook
params = init_fn.(input, %{})
Calling predict_fn triggers the pre-forward and forward hooks
predict_fn.(params, input)
Taking a gradient through the model triggers the backward hook
Nx.Defn.grad(fn params -> predict_fn.(params, input) end).(params)
Hooks can be configured to only run when the model is built in a certain mode, such as training or inference mode
model =
Axon.input("data")
|> Axon.dense(8)
|> Axon.attach_hook(&IO.inspect/1, on: :forward, mode: :train)
|> Axon.relu()
{init_fn, predict_fn} = Axon.build(model, mode: :train)
params = init_fn.(input, %{})
predict_fn.(params, input)
Now, building with :inference mode, the hook will not run
{init_fn, predict_fn} = Axon.build(model, mode: :inference)
params = init_fn.(input, %{})
predict_fn.(params, input)