Powered by AppSignal & Oban Pro

Runtime introspection with VegaLite

vm_introspection.livemd

Runtime introspection with VegaLite

Mix.install([
  {:kino, "~> 0.6.1"},
  {:kino_vega_lite, "~> 0.1.1"}
])

alias VegaLite, as: Vl

Introduction

In this notebook, we will use Kino and VegaLite to introspect and plot how our system behaves over time. If you are not familiar with VegaLite, read its introductory notebook.

You can see both dependencies listed in the setup cell above. We also define a convenience shortcut for the VegaLite module, let’s run the setup and we are ready to go!

Connecting to a remote node

Our goal is to introspect an Elixir node. The code we will write in this notebook can be used to introspect any running Elixir node. It can be a development environment that you would start with:

iex --name my_app@IP -S mix TASK

Or a production node assembled via mix release.

In order to connect two nodes, we need to know their node name and their cookie. We can get this information for the Livebook runtime like this:

IO.puts node()
IO.puts Node.get_cookie()

We will capture this information using Kino inputs. However, for convenience, we will use the node and cookie of the current notebook as default values. This means that, if you don’t have a separate Elixir, the runtime will connect and introspect itself. Let’s render the inputs:

node_input = Kino.Input.text("Node", default: node())
cookie_input = Kino.Input.text("Cookie", default: Node.get_cookie())

Kino.render(node_input)
Kino.render(cookie_input)
:ok

Now let’s read the inputs, configure the cookie, and connect to the other node:

node =
  node_input
  |> Kino.Input.read()
  |> String.to_atom()

cookie =
  cookie_input
  |> Kino.Input.read()
  |> String.to_atom()

Node.set_cookie(node, cookie)
true = Node.connect(node)

Having successfully connected, let’s try spawning a process on the remote node!

Node.spawn(node, fn ->
  IO.inspect(node())
end)

Inspecting processes

Now we are going to extract some information from the running node on our own!

Let’s get the list of all processes in the system:

remote_pids = :rpc.call(node, Process, :list, [])

Wait, but what is this :rpc.call/4 thing? 🤔

Previously we used Node.spawn/2 to run a process on the other node and we used the IO module to get some output. However, now we actually care about the resulting value of Process.list/0!

We could still use Node.spawn/2 to send us the results, which we would receive, but doing that over and over can be quite tedious. Fortunately, :rpc.call/4 does essentially that - evaluates the given function on the remote node and returns its result.

Now, let’s gather more information about each process 🕵️

processes =
  Enum.map(remote_pids, fn pid ->
    # Extract interesting process information
    info = :rpc.call(node, Process, :info, [pid, [:reductions, :memory, :status]])
    # The result of inspect(pid) is relative to the node
    # where it was called, that's why we call it on the remote node
    pid_inspect = :rpc.call(node, Kernel, :inspect, [pid])

    %{
      pid: pid_inspect,
      reductions: info[:reductions],
      memory: info[:memory],
      status: info[:status]
    }
  end)

Having all that data, we can now visualize it on a scatter plot using VegaLite:

Vl.new(width: 600, height: 400)
|> Vl.data_from_values(processes)
|> Vl.mark(:point, tooltip: true)
|> Vl.encode_field(:x, "reductions", type: :quantitative, scale: [type: "log", base: 10])
|> Vl.encode_field(:y, "memory", type: :quantitative, scale: [type: "log", base: 10])
|> Vl.encode_field(:color, "status", type: :nominal)
|> Vl.encode_field(:tooltip, "pid", type: :nominal)

From the plot we can easily see which processes do the most work and take the most memory.

Tracking memory usage

So far we have used VegaLite to draw static plots. However, we can use Kino to dynamically push data to VegaLite. Let’s use them together to plot the runtime memory usage over time.

There’s a very simple way to determine current memory usage in the VM:

:erlang.memory()

Now let’s build a dynamic VegaLite graph. Instead of returning the VegaLite specification as is, we will wrap it in Kino.VegaLite.new/1 to make it dynamic:

memory_plot =
  Vl.new(width: 600, height: 400, padding: 20)
  |> Vl.repeat(
    [layer: ["total", "processes", "atom", "binary", "code", "ets"]],
    Vl.new()
    |> Vl.mark(:line)
    |> Vl.encode_field(:x, "iter", type: :quantitative, title: "Measurement")
    |> Vl.encode_repeat(:y, :layer, type: :quantitative, title: "Memory usage (MB)")
    |> Vl.encode(:color, datum: [repeat: :layer], type: :nominal)
  )
  |> Kino.VegaLite.new()

Now we can use Kino.VegaLite.periodically/4 to create a self-updating plot of memory usage over time on the remote node:

Kino.VegaLite.periodically(memory_plot, 200, 1, fn i ->
  point =
    :rpc.call(node, :erlang, :memory, [])
    |> Enum.map(fn {type, bytes} -> {type, bytes / 1_000_000} end)
    |> Map.new()
    |> Map.put(:iter, i)

  Kino.VegaLite.push(memory_plot, point, window: 1000)
  {:cont, i + 1}
end)

Unless you connected to a production node, the memory usage most likely doesn’t change, so to emulate some spikes you can run the following code:

Binary usage

for i <- 1..10_000 do
  String.duplicate("cat", i)
end

ETS usage

tid = :ets.new(:users, [:set, :public])

for i <- 1..1_000_000 do
  :ets.insert(tid, {i, "User #{i}"})
end

In the next notebook, we will learn how to use Kino.Control to build a chat app!