
Elixir AI

Mix.install([
  {:kino, "~> 0.9.0"},
  {:benchee, "~> 1.1"},
  {:nx, "~> 0.5.2"},
  {:axon, "~> 0.5.1"},
  {:exla, "~> 0.5.2"}
])

Artificial Intelligence With Elixir

Elixir supports the numerical computing needed to build artificial intelligence. This is accomplished with the following libraries:

  • Nx - multi-dimensional tensor library with multi-staged compilation to the CPU/GPU
  • Axon - high-level interface for creating neural network models

Nx: Multi-dimensional Tensors and Numerical Expressions

Nx Hexdocs

The following code cells in this section are from, or derived from, the Intro to Nx guide on HexDocs.

A simple tensor

t = Nx.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

Since tensors are used to perform multi-dimensional mathematics, they have a shape associated with them

t |> Nx.shape()

An Nx.Tensor can also be created from a list

t = 1..4 |> Enum.chunk_every(2) |> Nx.tensor(names: [:y, :x])

The above Nx.Tensor has named dimensions, so they can be accessed by name

%{"first column" => t[x: 0], "first row" => t[y: 0]}

Exercise:

  • Create a {3,3} tensor with named dimensions
  • Return a {2,2} tensor containing the first two columns of the first two rows
t =
  1..9
  |> Enum.chunk_every(3)
  |> Nx.tensor(names: [:j, :i])

t = t[j: 0..1][i: 0..1]

Tensor Aware Functions

t

The Nx module has many functions that apply scalar operations element-wise to an Nx.Tensor

t |> Nx.cos()

You can also call functions that aggregate the contents of a tensor, for example to get the sum of the numbers in an Nx.Tensor

t |> Nx.sum()
# Sum of rows in tensor
t |> Nx.sum(axes: [:i])
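Other aggregations follow the same pattern; as a small illustrative addition (not from the original guide), Nx.mean/2 also accepts the :axes option:

# Mean along the :i axis (illustrative addition)
t |> Nx.mean(axes: [:i])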

Exercise

  • Create a {2, 2, 2} tensor
  • With values 1..8
  • With dimension names [:z, :y, :x]
  • Calculate the sums along the :y axis
1..8
|> Enum.chunk_every(4)
|> Enum.map(&Enum.chunk_every(&1, 2))
|> Nx.tensor(names: [:z, :y, :x])
|> Nx.sum(axes: [:y])

Other matrix operations such as subtraction are available via the Nx module

a = Nx.tensor([[5, 6], [7, 8]])
b = Nx.tensor([[1, 2], [3, 4]])

a |> Nx.subtract(b)
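Element-wise multiplication and matrix multiplication work the same way; this is a small illustrative addition using Nx.multiply/2 and Nx.dot/2:

# Element-wise product and matrix product of a and b (illustrative addition)
%{"element-wise" => Nx.multiply(a, b), "matrix product" => Nx.dot(a, b)}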

Broadcasting

Nx.broadcast/2 takes a tensor or a scalar and a shape, translating it to a compatible shape by copying it

Nx.broadcast(1, {2, 2})
a = Nx.tensor([[1, 2], [3, 4]])

# Want to do [[1, 2], [3, 4]] - 1 (subtract 1 from every element in the LHS)

b = Nx.subtract(a, Nx.broadcast(1, {2, 2}))

b == Nx.subtract(a, 1)

# Here we pass a tensor to Nx.broadcast/2 and it will extract its shape to make a compatible operation
b == Nx.subtract(a, Nx.broadcast(1, a))
# Subtract row (or column) wise
# Want to do [[1, 2], [3, 4]] - [[1, 2]] === [[1, 2], [3, 4]] - [[1, 2], [1, 2]] === [[0, 0], [2, 2]]

a = Nx.tensor([[1, 2], [3, 4]])
b = Nx.tensor([[1, 2]])
c = a |> Nx.subtract(Nx.broadcast(b, {2, 2}))

# The subtraction function will take care of the broadcast implicitly
c2 = a |> Nx.subtract(b)

c == c2

Automatic Differentiation (autograd)

Gradients are critical for solving systems of equations and building probabilistic models. In advanced math, derivatives are used to compute gradients. Nx can compute these derivatives automatically through a feature called automatic differentiation, or autograd.

defmodule Funs do
  import Nx.Defn

  defn poly(x) do
    3 * Nx.pow(x, 2) + 2 * x + 1
  end

  defn poly_slope_at(x) do
    grad(&poly/1).(x)
  end

  defn sinwave(x) do
    Nx.sin(x)
  end

  defn sinwave_slope_at(x) do
    grad(&sinwave/1).(x)
  end
end

The function grad/1 takes a function and returns a function returning the gradient

You can check that this value is correct by looking at the graph of the derivative, 6x + 2

Funs.poly_slope_at(2)

You can check that this value is correct by looking at the graph of the derivative, cos(x)

Funs.sinwave_slope_at(1)
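
As a quick numeric sanity check (my own addition), the analytic derivatives are 6x + 2 and cos(x), so at x = 2 and x = 1 we expect 14.0 and roughly 0.5403:

# Expected analytic values for the two gradients computed above
{6 * 2 + 2, Nx.cos(1)}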

Axon: High-Level Interface For Neural Network Models

Hexdocs

Axon is a high-level interface for creating neural network models

Axon is built entirely on top of Nx numerical definitions, so every neural network can be JIT or AOT compiled using any Nx compiler, or even transformed into high-level neural network formats like TensorFlow Lite and ONNX.

First Model (Identity Model)

Everything in Axon centers around the %Axon{} struct which represents an instance of an Axon model

Models are graphs which represent transformation and flow of input data to a desired output. You can think of models as representing a single computation or function

All Axon models start with a declaration of input nodes. These are the root nodes of your computation graph, and correspond to the actual input data you want to send to Axon:

model = Axon.input("data")

model is now, technically speaking, a valid Axon model which you can inspect, execute, and initialize

template = Nx.template({2, 8}, :f32)
model |> Axon.Display.as_graph(template)

The execution flow is just a single node, because the graph consists only of an input node. You pass data in and the model returns the same data, without any intermediate transformations

You can build the %Axon{} struct into its initialization and forward functions by calling Axon.build/2. This pattern of “lowering” or transforming the %Axon{} struct into other functions or representations is very common in Axon. By traversing the data structure you can create useful functions, execution visualizations, and more

{init_fn, predict_fn} = Axon.build(model)

init_fn returns all of your model’s trainable parameters and state. You need to pass a template and any initial parameters you want your model to start with (this is useful for things like transfer learning)

predict_fn computes your model’s output (the transformed inputs) from the trainable parameters and the given inputs

training_params = init_fn.(template, %{})

init_fn/2 returned %{} because the model does not have any trainable parameters. This should make sense because it’s just an input layer

input = 1..8 |> Enum.chunk_every(4) |> Nx.tensor(type: :f32)
predict_fn.(training_params, input)

Passing the training_params and some input to the predict_fn, the model can actually be executed, returning the given input as expected
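
To confirm this, here is a small added check comparing the output with the input element-wise:

# Nx.equal/2 compares element-wise; Nx.all/1 reduces to 1 (true) only if every element matched
predict_fn.(training_params, input)
|> Nx.equal(input)
|> Nx.all()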

Sequential Models

Sequential models are named after the sequential way data flows through them: they transform the input with a series of successive transformations

Creating a sequential model in Axon is the same as writing sequential transformations in “regular” Elixir


model =
  Axon.input("data")
  # layer with 32 outputs
  |> Axon.dense(32)
  # layer with an element wise operation of :relu
  |> Axon.activation(:relu)
  # layer to reduce overfitting, effective regularization method
  |> Axon.dropout(rate: 0.5)
  # layer with 1 output
  |> Axon.dense(1)
  # layer with an element wise operation of :softmax
  |> Axon.activation(:softmax)

Visualizing the model we can see how the data will flow

template = Nx.template({4, 8}, :f32)
Axon.Display.as_graph(model, template)
{init_fn, predict_fn} = Axon.build(model)
training_params = init_fn.(template, %{})

This model actually has trainable parameters. The parameter map is just a regular Elixir map. Each top-level entry maps to a layer with a key corresponding to that layer’s name and a value corresponding to that layer’s trainable parameters. Each layer’s individual trainable parameters are given layer-specific names and map directly to Nx tensors
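
As an illustration (assuming Axon's default layer names such as "dense_0"; your names may differ if layers were named explicitly), you can drill into the parameter map like any other Elixir map:

# Kernel shape of the first dense layer; "dense_0" is the assumed default layer name
training_params["dense_0"]["kernel"] |> Nx.shape()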

input = Nx.iota({4, 8}, type: :f32)
predict_fn.(training_params, input)

Complex Models

Some models require a more flexible API. Since Axon models are just Elixir data structures, you can manipulate them and decompose architectures as you would any other Elixir program

input = Axon.input("data")
x1 = input |> Axon.dense(32)
x2 = input |> Axon.dense(64) |> Axon.relu() |> Axon.dense(32)
model = Axon.add(x1, x2)

Your model branches the input into x1 and x2. Each branch performs a different set of transformations. At the end, the branches are merged with Axon.add/3. The layer created with Axon.add/3 is sometimes called a combinator: a layer that operates on multiple Axon models at once, typically to merge some branches together

model represents the final Axon model

Visualizing this model you can see the fully built branching in this model

template = Nx.template({2, 16}, :f32)
Axon.Display.as_graph(model, template)
{init_fn, predict_fn} = Axon.build(model)
training_params = init_fn.(template, %{})

As your model’s architecture grows in complexity, you might find yourself reaching for better abstractions to organize your model creation code. PyTorch models are often organized into nn.Module. If you’re translating models from PyTorch to Axon, it’s natural to create one Elixir function per nn.Module.

You should write your models as you would any other Elixir code

defmodule ComplexModel do
  def create() do
    Axon.input("data")
    |> conv_block()
    |> Axon.flatten()
    |> dense_block()
    |> dense_block()
    |> Axon.dense(1)
  end

  defp conv_block(input) do
    input
    |> Axon.conv(3, padding: :same)
    |> Axon.mish()
    |> Axon.add(input)
    |> Axon.max_pool(kernel_size: {2, 3})
  end

  defp dense_block(input) do
    input
    |> Axon.dense(32)
    |> Axon.relu()
  end
end
model = ComplexModel.create()
template = Nx.template({1, 28, 28, 3}, :f32)
Axon.Display.as_graph(model, template)

Multi-Input & Multi-Output Models

Multi-Input Model

Sometimes your model needs multiple inputs

input1 = Axon.input("input1")
input2 = Axon.input("input2")
model = Axon.add(input1, input2)

You can inspect the inputs of your model in two ways: Axon.get_inputs/1 and Axon.Display.as_graph

Axon.get_inputs(model)
inputs = %{"input1" => Nx.template({2, 8}, :f32), "input2" => Nx.template({2, 8}, :f32)}
Axon.Display.as_graph(model, inputs)
{init_fn, predict_fn} = Axon.build(model)
training_params = init_fn.(inputs, %{})
input1 = Nx.iota({2, 8}, type: :f32)
input2 = Nx.iota({2, 8}, type: :f32)
inputs = %{"input1" => input1, "input2" => input2}
predict_fn.(training_params, inputs)

Multi-Output Models

You also might want to have multiple outputs from your model. Axon.container/2 can be used to wrap multiple nodes into any supported Nx container:

data = Axon.input("data")
x1 = data |> Axon.dense(32) |> Axon.relu()
x2 = data |> Axon.dense(64) |> Axon.relu()
model = Axon.container({x1, x2})
template = Nx.template({2, 8}, :f32)
Axon.Display.as_graph(model, template)
{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(template, %{})
data = Nx.iota({2, 8}, type: :f32)
predict_fn.(params, data)

Creating Custom Layers

Axon has a number of built-in layers such as Axon.relu and Axon.dense. As you develop more sophisticated models you will likely need to develop custom layers

A layer in Axon is just a defn implementation with special Axon inputs. Every layer in Axon is implemented with the Axon.layer/3 function. The API of Axon.layer/3 intentionally mirrors Kernel.apply/2 to make developing custom layers as close to writing “normal” Elixir code as possible.

A layer implementation looks like any other defn, except it must always accept an opts argument as its final parameter, because Axon will always pass a :mode option indicating whether the model is running in training or inference mode. This lets you customize the behavior of the layer based on the execution mode (a mode-aware sketch appears at the end of this section).

If you plan on re-using custom layers in many locations, it’s recommended that you wrap them in Elixir functions as an interface. If you tried to use defn instead of def for the wrapper, you would receive an error about a LazyContainer being used incorrectly

defmodule CustomLayers do
  import Nx.Defn

  def my_layer(%Axon{} = input, opts \\ []) do
    opts = Keyword.validate!(opts, [:name])
    alpha = Axon.param("alpha", fn _ -> {} end)
    Axon.layer(&my_layer_impl/3, [input, alpha], name: opts[:name], op_name: :my_layer)
  end

  defnp my_layer_impl(input, alpha, _opts \\ []) do
    input
    |> Nx.sin()
    |> Nx.multiply(alpha)
  end
end

With the layer implementation defined, the custom layer can be piped into a model just like any built-in Axon layer

model =
  Axon.input("data")
  |> CustomLayers.my_layer()
  |> CustomLayers.my_layer()
  |> Axon.dense(1)
Axon.Display.as_graph(model, Nx.template({2, 8}, :f32))
{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(Nx.template({2, 8}, :f32), %{})
predict_fn.(params, Nx.iota({2, 8}, type: :f32))
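
To illustrate the :mode option described earlier, here is a hedged sketch (the module, function, and behavior are illustrative, not from the original guide) of a layer implementation that branches on opts[:mode]:

defmodule ModeAwareLayers do
  import Nx.Defn

  # Illustrative custom layer: doubles its input in training mode,
  # passes it through unchanged in inference mode
  def maybe_double(%Axon{} = input, opts \\ []) do
    opts = Keyword.validate!(opts, [:name])
    Axon.layer(&maybe_double_impl/2, [input], name: opts[:name], op_name: :maybe_double)
  end

  defnp maybe_double_impl(input, opts \\ []) do
    opts = keyword!(opts, mode: :inference)

    case opts[:mode] do
      :train -> Nx.multiply(input, 2)
      :inference -> input
    end
  end
end

Building such a model with Axon.build(model, mode: :train) would then exercise the :train branch.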

Model Hooks

Inspecting or visualizing the values of intermediate layers in your model during the forward or backward pass can be key to understanding the behavior of your model (e.g., visualizing the gradients of activation functions to make sure your model is learning in a stable manner). Axon supports this via model hooks.

Model hooks are unidirectional communication with an executing model. Hooks are unidirectional in the sense that you can only receive information from your model, and not send information to the model.

Hooks attach per layer and can execute at 4 different points in model execution: initialization, pre-forward, forward pass, or backward pass

input = Nx.iota({2, 4}, type: :f32)
model =
  Axon.input("data")
  |> Axon.dense(8)
  |> Axon.attach_hook(fn val -> IO.inspect(val, label: :dense_init) end, on: :initialize)
  |> Axon.attach_hook(fn val -> IO.inspect(val, label: :dense_forward) end, on: :forward)
  |> Axon.relu()
  |> Axon.attach_hook(fn val -> IO.inspect(val, label: :relu) end, on: :forward)

{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(input, %{})
predict_fn.(params, input)

Hooks execute in the order they were attached to a layer. If you attach 2 hooks to the same layer which execute different functions on the same event, they will run in order

model =
  Axon.input("data")
  |> Axon.dense(8)
  |> Axon.attach_hook(fn val -> IO.inspect(val, label: :hook1) end, on: :forward)
  |> Axon.attach_hook(fn val -> IO.inspect(val, label: :hook2) end, on: :forward)
  |> Axon.relu()

{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(input, %{})
predict_fn.(params, input)

Hooks can also be configured to run on all events

model =
  Axon.input("data")
  |> Axon.dense(8)
  |> Axon.attach_hook(&IO.inspect/1, on: :all)
  |> Axon.relu()
  |> Axon.dense(1)

{init_fn, predict_fn} = Axon.build(model)

initialization hook

params = init_fn.(input, %{})

pre-forward and forward hooks

predict_fn.(params, input)

backwards hook

Nx.Defn.grad(fn params -> predict_fn.(params, input) end).(params)

Hooks can also be configured to only run when the model is built in a certain mode, such as training or inference

model =
  Axon.input("data")
  |> Axon.dense(8)
  |> Axon.attach_hook(&IO.inspect/1, on: :forward, mode: :train)
  |> Axon.relu()

{init_fn, predict_fn} = Axon.build(model, mode: :train)
params = init_fn.(input, %{})
predict_fn.(params, input)

When the model is built in :inference mode, the hook will not run

{init_fn, predict_fn} = Axon.build(model, mode: :inference)
params = init_fn.(input, %{})
predict_fn.(params, input)