Radon Varying-Intercept Model

notebooks/09_radon_bhm.livemd

Igor O'sten

@borodark

eXMC

Setup


# CPU only — no GPU required
System.put_env("EXLA_CPU_ONLY", "true")
System.put_env("CUDA_VISIBLE_DEVICES", "")
Mix.install([
  {:exmc, path: Path.expand("../", __DIR__)},
  {:exla, "~> 0.10"},
  {:kino_vega_lite, "~> 0.1"}
])
Application.put_env(:exla, :clients, host: [platform: :host])
Application.put_env(:exla, :default_client, :host)
Nx.default_backend(Nx.BinaryBackend)
Nx.Defn.default_options(compiler: EXLA, client: :host)

Why This Matters

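Radon causes an estimated 21,000 lung cancer deaths in the United States each year, according to the EPA. The gas is a decay product of uranium in soil and rock, and it seeps into basements. Some counties have high radon; some don’t. The EPA needs to know which.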

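The problem: most counties have fewer than five measurements. Take a county with two readings of 4.1 and 1.2 pCi/L: what is the true mean? The sample mean (2.65) has a standard error of about 1.45, more than half the estimate itself, and with only one degree of freedom the 95% interval is roughly ±18 pCi/L. Useless.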
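The arithmetic for the two-reading county can be checked directly. The sketch below uses plain Elixir and no libraries; the t critical value is the standard table entry for one degree of freedom, hard-coded here rather than computed.

```elixir
# Two readings from one county (pCi/L), as in the text above.
readings = [4.1, 1.2]
n = length(readings)

mean = Enum.sum(readings) / n

# Sample variance with the n - 1 denominator.
variance =
  readings
  |> Enum.map(fn y -> :math.pow(y - mean, 2) end)
  |> Enum.sum()
  |> Kernel./(n - 1)

# Standard error of the sample mean.
std_error = :math.sqrt(variance / n)

# 97.5% quantile of Student's t with 1 degree of freedom (table value).
t_crit = 12.706
half_width = t_crit * std_error

IO.puts("mean           = #{Float.round(mean, 2)}")
IO.puts("standard error = #{Float.round(std_error, 2)}")
IO.puts("95% half-width = #{Float.round(half_width, 1)}")
```

With only two readings, the interval is dominated by the t critical value rather than by the data, which is why an unpooled county estimate carries so little information.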

But we know something the sample mean doesn’t: counties with high uranium tend to have high radon. And nearby counties tend to be similar. A hierarchical model with uranium as a county-level predictor borrows strength from geology. The county with two measurements gets an estimate informed by all 919 observations across the 85 counties. The result isn’t a guess. Stein showed in 1956 that when three or more means are estimated together, the separate sample means are inadmissible under squared error loss: a shrinkage estimator achieves lower total mean squared error. The hierarchical model puts that shrinkage into practice.

This is the canonical Bayesian hierarchical example — Gelman & Hill’s radon dataset. If your framework can’t do this, it can’t do applied statistics.

The Radon Model

The radon varying-intercept model is the “MNIST of Bayesian hierarchical models”. Introduced by Gelman & Hill (2007), it estimates indoor radon levels across counties, partially pooling each county intercept toward a prediction based on that county’s uranium level.

This is a d=90 model: 5 hyperparameters + 85 county-level intercepts.

Model structure:

mu_alpha ~ Normal(0, 10) — grand intercept mean
gamma_u ~ Normal(0, 5) — uranium coefficient (county-level predictor)
sigma_alpha ~ HalfCauchy(2.5) — county intercept spread
sigma_y ~ HalfCauchy(2.5) — observation noise
beta ~ Normal(0, 5) — floor effect (basement vs first floor)
alpha_raw_j ~ Normal(0, 1) for j = 0..84 — NCP county intercepts
alpha_j = mu_alpha + gamma_u * uranium_j + sigma_alpha * alpha_raw_j
y_{ij} ~ Normal(alpha_j + beta * floor_{ij}, sigma_y)
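The last two lines of the structure above are deterministic once the parameters are fixed. Here is a minimal sketch of them using plain Elixir lists; it is not the notebook's actual Custom dist code. The module name `RadonSketch`, the uranium values and the `alpha_raw` values are hypothetical, and the parameter values reuse the true values listed in the Posterior Summary section.

```elixir
defmodule RadonSketch do
  # alpha_j = mu_alpha + gamma_u * uranium_j + sigma_alpha * alpha_raw_j
  def county_intercepts(mu_alpha, gamma_u, sigma_alpha, uranium, alpha_raw) do
    Enum.zip(uranium, alpha_raw)
    |> Enum.map(fn {u, z} -> mu_alpha + gamma_u * u + sigma_alpha * z end)
  end

  # Mean of y_ij: alpha_j + beta * floor_ij (floor is 0 or 1).
  def expected_y(alphas, beta, county, floor) do
    Enum.at(alphas, county) + beta * floor
  end
end

# Hypothetical county-level inputs for three counties.
uranium = [-0.5, 0.0, 0.8]
alpha_raw = [1.0, 0.0, -1.0]

alphas = RadonSketch.county_intercepts(1.4, 0.7, 0.4, uranium, alpha_raw)

# First-floor measurement (floor = 1) in county 2, with beta = -0.7.
first_floor_mean = RadonSketch.expected_y(alphas, -0.7, 2, 1)

IO.inspect(alphas, label: "alphas")
IO.inspect(first_floor_mean, label: "county 2, first floor")
```

The sampler only ever sees `alpha_raw`; the county intercepts are derived quantities, which is why the notebook reconstructs them from the trace later on.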

# Generate synthetic radon data (85 counties, ~919 observations)
Code.require_file("../benchmark/radon_data.exs", __DIR__)
Code.require_file("../benchmark/radon_model.exs", __DIR__)

data = Exmc.Benchmark.RadonData.generate(seed: 42)

IO.puts("Counties: #{data.n_counties}")
IO.puts("Total observations: #{length(data.county_idx)}")
IO.puts("True mu_alpha: #{data.true_params.mu_alpha}")
IO.puts("True beta (floor effect): #{data.true_params.beta}")
IO.puts("True sigma_y: #{data.true_params.sigma_y}")

Building and Sampling

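The model uses manual non-centered parameterization (NCP): each county intercept is expressed through alpha_raw_j ~ N(0,1), and alpha_j is reconstructed inside a Custom dist closure. This lets the sampler explore the 85-dimensional intercept space without funnel geometry. The sampler call below passes ncp: false, presumably because the reparameterization has already been done by hand.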

ir = Exmc.Benchmark.RadonModel.build(data)
init = Exmc.Benchmark.RadonModel.init_values(data)

IO.puts("Free parameters: #{map_size(init)}")

t0 = System.monotonic_time(:millisecond)

{trace, stats} =
  Exmc.NUTS.Sampler.sample(ir, init,
    num_warmup: 1000,
    num_samples: 1000,
    seed: 42,
    ncp: false
  )

wall_s = (System.monotonic_time(:millisecond) - t0) / 1000.0
IO.puts("Wall time: #{Float.round(wall_s, 1)}s")
IO.puts("Step size: #{Float.round(stats.step_size, 4)}")
IO.puts("Divergences: #{stats.divergences}")
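The identity behind the non-centered parameterization can be checked numerically: shifting and scaling a standard normal draw gives a draw from Normal(mu, sigma). The sketch below uses only Erlang's `:rand` module and does not touch Exmc; the values of `mu`, `sigma` and the seed are illustrative assumptions.

```elixir
# Fixed seed so the run is repeatable.
:rand.seed(:exsss, {1, 2, 3})

mu = 1.4
sigma = 0.4
n = 20_000

# Non-centered form: draw z ~ Normal(0, 1), then shift and scale.
draws = for _ <- 1..n, do: mu + sigma * :rand.normal()

mean = Enum.sum(draws) / n

sd =
  draws
  |> Enum.map(fn x -> :math.pow(x - mean, 2) end)
  |> Enum.sum()
  |> Kernel./(n - 1)
  |> :math.sqrt()

IO.puts("sample mean = #{Float.round(mean, 3)} (target #{mu})")
IO.puts("sample sd   = #{Float.round(sd, 3)} (target #{sigma})")
```

The benefit for NUTS is that the sampled variable `z` has unit scale regardless of `sigma`, so the posterior geometry no longer narrows as the group-level spread shrinks.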

Posterior Summary

alias Exmc.Diagnostics

params = ["mu_alpha", "gamma_u", "sigma_alpha", "sigma_y", "beta"]
true_vals = [1.4, 0.7, 0.4, 0.7, -0.7]

for {name, true_val} <- Enum.zip(params, true_vals) do
  samples = Nx.to_flat_list(trace[name])
  mean = Enum.sum(samples) / length(samples)
  ess = Diagnostics.ess(samples)

  IO.puts(
    "#{String.pad_trailing(name, 14)} mean=#{Float.round(mean, 3)} " <>
      "(true=#{true_val}) ESS=#{Float.round(ess, 0)}"
  )
end
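A posterior mean alone hides the uncertainty, so an interval is a useful companion to the table above. The following is a minimal sketch of a percentile interval over a list of draws; the module `PosteriorSketch` and its nearest-rank quantile rule are assumptions made for illustration, not part of Exmc.

```elixir
defmodule PosteriorSketch do
  # Nearest-rank quantile of a list of numbers, with q in [0, 1].
  def quantile(samples, q) do
    sorted = Enum.sort(samples)
    idx = round(q * (length(sorted) - 1))
    Enum.at(sorted, idx)
  end

  # Equal-tailed interval containing `mass` of the draws.
  def interval(samples, mass \\ 0.95) do
    tail = (1.0 - mass) / 2
    {quantile(samples, tail), quantile(samples, 1.0 - tail)}
  end
end

# Stand-in draws: 0.00, 0.01, ..., 1.00 (not real posterior samples).
samples = Enum.map(0..100, fn i -> i / 100 end)

{lo, hi} = PosteriorSketch.interval(samples, 0.9)
IO.puts("90% interval: [#{lo}, #{hi}]")
```

In the notebook, the same helper could be applied to `Nx.to_flat_list(trace[name])` inside the loop above to print an interval next to each mean.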

County-Level Shrinkage

The hallmark of hierarchical models is shrinkage: county estimates are pulled toward the group mean, with small-sample counties shrinking more. Let’s visualize this.

# Reconstruct county intercepts from trace
alphas = Exmc.Benchmark.RadonModel.reconstruct_alphas(trace, data)

# Compute posterior means for each county
county_means =
  Enum.map(0..(data.n_counties - 1), fn j ->
    samples = Nx.to_flat_list(alphas[j])
    mean = Enum.sum(samples) / length(samples)
    n_obs = Enum.count(data.county_idx, &(&1 == j))
    %{county: j, posterior_mean: mean, true_alpha: Enum.at(data.true_alphas, j), n_obs: n_obs}
  end)

alias VegaLite, as: Vl

# Shrinkage plot: posterior mean vs true alpha, sized by county sample size
Vl.new(width: 600, height: 400, title: "County Intercept Shrinkage")
|> Vl.data_from_values(county_means)
|> Vl.mark(:circle, opacity: 0.7)
|> Vl.encode_field(:x, "true_alpha", type: :quantitative, title: "True County Intercept")
|> Vl.encode_field(:y, "posterior_mean", type: :quantitative, title: "Posterior Mean")
|> Vl.encode_field(:size, "n_obs", type: :quantitative, title: "# Observations")
|> Vl.encode_field(:color, "n_obs",
  type: :quantitative,
  scale: %{scheme: "viridis"},
  title: "# Observations"
)

# Grand mean line: alpha_j should cluster around mu_alpha + gamma_u * uranium_j
mu_alpha_mean =
  Nx.to_flat_list(trace["mu_alpha"]) |> then(&(Enum.sum(&1) / length(&1)))

gamma_u_mean =
  Nx.to_flat_list(trace["gamma_u"]) |> then(&(Enum.sum(&1) / length(&1)))

IO.puts("Grand mean: mu_alpha=#{Float.round(mu_alpha_mean, 3)}, gamma_u=#{Float.round(gamma_u_mean, 3)}")
IO.puts("Small counties shrink toward the group mean; large counties keep their local estimate.")
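The amount of shrinkage can be approximated in closed form. For a normal model with the variances treated as known, the conditional posterior mean of a county intercept is a precision-weighted average of the county's own sample mean and the group-level prediction. The sketch below is an approximation under that assumption, not what the sampler computes; `ShrinkageSketch`, the county mean of 2.0 and the group-level prediction of 1.4 are hypothetical, while sigma_y = 0.7 and sigma_alpha = 0.4 are the true values used earlier.

```elixir
defmodule ShrinkageSketch do
  # Weight placed on the county's own sample mean:
  # (n / sigma_y^2) / (n / sigma_y^2 + 1 / sigma_alpha^2)
  def weight(n_obs, sigma_y, sigma_alpha) do
    data_precision = n_obs / (sigma_y * sigma_y)
    prior_precision = 1.0 / (sigma_alpha * sigma_alpha)
    data_precision / (data_precision + prior_precision)
  end

  # Partially pooled estimate: weighted average of the two sources.
  def pooled_mean(county_mean, group_mean, n_obs, sigma_y, sigma_alpha) do
    w = weight(n_obs, sigma_y, sigma_alpha)
    w * county_mean + (1.0 - w) * group_mean
  end
end

for n <- [2, 10, 50] do
  w = ShrinkageSketch.weight(n, 0.7, 0.4)
  est = ShrinkageSketch.pooled_mean(2.0, 1.4, n, 0.7, 0.4)
  IO.puts("n=#{n} weight=#{Float.round(w, 3)} estimate=#{Float.round(est, 3)}")
end
```

With these values a county with two observations keeps roughly 40% of its own sample mean, while a county with fifty keeps about 94%, which is the pattern the shrinkage plot is meant to show.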

Floor Effect

Basement measurements (floor=0) should show higher radon than first-floor (floor=1). The beta parameter captures this — a negative value means first floor has lower radon.

beta_samples = Nx.to_flat_list(trace["beta"])

Vl.new(width: 500, height: 300, title: "Floor Effect (beta) Posterior")
|> Vl.data_from_values(Enum.map(beta_samples, &%{beta: &1}))
|> Vl.mark(:bar)
|> Vl.encode_field(:x, "beta", type: :quantitative, bin: %{maxbins: 40}, title: "beta")
|> Vl.encode(:y, aggregate: :count, type: :quantitative)
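Beyond the histogram, the claim that first-floor readings are lower can be summarized as a single number: the fraction of posterior draws with beta below zero. A minimal sketch follows, using a short hypothetical list of draws in place of the real trace.

```elixir
# Hypothetical beta draws standing in for Nx.to_flat_list(trace["beta"]).
beta_draws = [-0.82, -0.65, -0.71, -0.58, -0.9, -0.74, 0.03, -0.69]

# Monte Carlo estimate of P(beta < 0 | data).
p_negative = Enum.count(beta_draws, fn b -> b < 0 end) / length(beta_draws)

IO.puts("P(beta < 0) = #{p_negative}")
```

In the notebook, replacing `beta_draws` with the `beta_samples` list defined above would give the corresponding estimate for the fitted model.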
