Cluster Sampling Demo: Livebook → eXMC via :erpc
The Setup
This notebook runs inside the livebook jail (10.17.89.13).
The sampling computation runs on the exmc jail (10.17.89.14).
They’re connected via distributed Erlang over bastille0 — same
cookie, same cluster, different jails.
No HTTP APIs. No message queues. Just :erpc.call/4.
# Verify we're in the cluster
IO.puts("I am: #{Node.self()}")
IO.puts("My peers: #{inspect(Node.list())}")
# Connect to exmc if not already connected
unless :"exmc@10.17.89.14" in Node.list() do
Node.connect(:"exmc@10.17.89.14")
end
IO.puts("exmc connected: #{:"exmc@10.17.89.14" in Node.list()}")
Define the Model
A simple Bayesian model: estimate the mean of some observed data. We define the model here in Livebook but it’s just data (an IR struct) — portable across nodes.
# Build the model IR — this is just a data structure, no computation yet
model_code = """
alias Exmc.{Builder, Dist}
ir = Builder.new_ir()
ir = Builder.rv(ir, "mu", Dist.Normal, %{mu: Nx.tensor(0.0), sigma: Nx.tensor(10.0)})
ir = Builder.rv(ir, "sigma", Dist.HalfNormal, %{sigma: Nx.tensor(2.0)})
ir = Builder.rv(ir, "y", Dist.Normal, %{mu: "mu", sigma: "sigma"})
observations = Nx.tensor([2.1, 2.5, 1.8, 2.3, 2.7, 1.9, 2.4, 2.2, 2.6, 2.0])
ir = Builder.obs(ir, "y_obs", "y", observations)
ir
"""
IO.puts("Model defined (10 observations, estimating mu and sigma)")
IO.puts("Will sample on: exmc@10.17.89.14")
Run 4 Chains on eXMC (Remote)
Four independent MCMC chains run in parallel on the exmc node.
We send the model over via :erpc, exmc runs sample_chains/3,
and returns all 4 traces back to Livebook as Nx tensors.
# Ship 4-chain sampling to exmc via :erpc
target = :"exmc@10.17.89.14"
{time_us, chains} = :timer.tc(fn ->
:erpc.call(target, fn ->
alias Exmc.{Builder, Dist}
ir = Builder.new_ir()
ir = Builder.rv(ir, "mu", Dist.Normal, %{mu: Nx.tensor(0.0), sigma: Nx.tensor(10.0)})
ir = Builder.rv(ir, "sigma", Dist.HalfNormal, %{sigma: Nx.tensor(2.0)})
ir = Builder.rv(ir, "y", Dist.Normal, %{mu: "mu", sigma: "sigma"})
observations = Nx.tensor([2.1, 2.5, 1.8, 2.3, 2.7, 1.9, 2.4, 2.2, 2.6, 2.0])
ir = Builder.obs(ir, "y_obs", "y", observations)
# 4 chains, 1000 draws each, 500 warmup
Exmc.NUTS.Sampler.sample_chains(ir, 4, num_samples: 1000, num_warmup: 500)
end, 60_000) # 60s timeout
end)
IO.puts("4 chains completed in #{div(time_us, 1000)}ms on #{target}")
IO.puts("Chains returned: #{length(chains)}")
Inspect the Posterior (4 chains)
Four traces came back from exmc as Nx tensors — serialized over distributed Erlang, deserialized here in Livebook. Zero ceremony.
# Summarize across all 4 chains
for {chain, i} <- Enum.with_index(chains, 1) do
mu_samples = chain["mu"]
sigma_samples = chain["sigma"]
mu_mean = Nx.mean(mu_samples) |> Nx.to_number() |> Float.round(3)
sigma_mean = Nx.mean(sigma_samples) |> Nx.to_number() |> Float.round(3)
IO.puts(" Chain #{i}: mu=#{mu_mean}, sigma=#{sigma_mean}")
end
# Pool all chains
all_mu = chains |> Enum.map(& &1["mu"]) |> Nx.concatenate()
all_sigma = chains |> Enum.map(& &1["sigma"]) |> Nx.concatenate()
IO.puts("""
Pooled posterior (#{Nx.size(all_mu)} total samples across 4 chains):
mu: #{Nx.mean(all_mu) |> Nx.to_number() |> Float.round(3)} ± #{Nx.standard_deviation(all_mu) |> Nx.to_number() |> Float.round(3)}
sigma: #{Nx.mean(all_sigma) |> Nx.to_number() |> Float.round(3)}
Computed on: exmc@10.17.89.14
Displayed on: #{Node.self()}
Transport: :erpc over distributed Erlang (bastille0 loopback)
""")
What This Demonstrates
- Livebook is the interactive frontend (notebook UI)
- eXMC is the compute backend (NUTS sampler)
- Connection: distributed Erlang over bastille0, shared cookie from ZFS secrets
-
Serialization: Nx tensors travel over
:erpctransparently - No infrastructure between them: no REST API, no message queue, no service mesh
This is what “off docker-compose” looks like for compute workloads: the cluster IS the API.