DistilBERT question answering on Emily

livebooks/distilbert_qa.livemd

@ausimian

emily


Mix.install(
  [
    {:emily, "~> 1.0"},
    {:bumblebee, "~> 0.7"},
    {:tokenizers, "~> 0.5"},
    {:nx, "~> 0.12"},
    {:kino, "~> 0.14"}
  ],
  config: [
    nx: [default_backend: Emily.Backend]
  ]
)

Overview

This notebook runs a DistilBERT question-answering pipeline on Emily.Backend. The backend is installed as the Nx global default by the Mix.install/2 config above, so every subsequent Nx call dispatches to MLX without further setup.
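A quick way to confirm that the default took effect is to inspect Nx's global default and the backend of a freshly created tensor. This is a minimal sketch that assumes the setup cell above has already run; the tensor values are arbitrary.

```elixir
# Should report Emily.Backend (plus its options) if the Mix.install config was applied.
IO.inspect(Nx.default_backend(), label: "default backend")

# A new tensor is allocated on the default backend; its data struct names that backend.
t = Nx.tensor([1.0, 2.0, 3.0])
IO.inspect(t.data.__struct__, label: "tensor backend")
```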

The model and tokenizer both come from Bumblebee. The only Emily-specific pieces are the Mix.install config line and, optionally, the Emily.Compiler attachment further down.

Loading the model

{:ok, model_info} =
  Bumblebee.load_model({:hf, "distilbert-base-uncased-distilled-squad"})

{:ok, tokenizer} =
  Bumblebee.load_tokenizer({:hf, "distilbert-base-uncased-distilled-squad"})

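The checkpoint is about 250 MB on first fetch; subsequent runs reuse the Bumblebee cache, which lives in the operating system's user cache directory (for example ~/.cache/bumblebee on Linux or ~/Library/Caches/bumblebee on macOS).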
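Because the cache location can differ by platform and configuration, it may help to print the directory Bumblebee actually resolves on the current machine. The sketch below relies on Bumblebee.cache_dir/0 and assumes the setup cell has run.

```elixir
# Resolved cache directory for this machine (honours BUMBLEBEE_CACHE_DIR if it is set).
IO.puts("Bumblebee cache: #{Bumblebee.cache_dir()}")
```

To keep checkpoints elsewhere, for instance on a larger disk, set the BUMBLEBEE_CACHE_DIR environment variable before starting the runtime.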

Building a serving

serving =
  Bumblebee.Text.question_answering(model_info, tokenizer,
    defn_options: [compiler: Emily.Compiler, native: true, native_fallback: :raise]
  )

Emily.Compiler pins the result backend to Emily.Backend and caps partition concurrency at 1 (for per-process concurrency, use Emily.Stream, which is covered in a separate notebook). With native: true, the whole forward pass is lowered through Emily’s native Expr compiler, so each call is one NIF replay rather than op-by-op dispatch. native_fallback: :raise makes any op that cannot be lowered raise instead of silently degrading to the evaluator. This pipeline lowers fully native, so :raise never trips.

Running a query

context =
  "Elixir is a dynamic, functional programming language that runs on the Erlang VM. " <>
    "It was created by José Valim in 2011."

question = "Who created Elixir?"

Nx.Serving.run(serving, %{question: question, context: context})

The expected output is a map shaped like this, where each _ stands for a value not shown here:

%{
  results: [
    %{text: "José Valim", start: _, end: _, score: _}
  ]
}
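Downstream code usually wants just the best answer rather than the whole map. The following is a minimal sketch of one way to pull it out; AnswerPicker is a hypothetical helper, not part of Bumblebee or Emily, and the sample map uses made-up offsets and a made-up score so the cell runs without the model.

```elixir
defmodule AnswerPicker do
  @moduledoc "Hypothetical helper: picks the highest-scoring answer from a QA result map."

  # Returns {:ok, text, score} for the best result, or :error when there are none.
  def best(%{results: []}), do: :error

  def best(%{results: results}) do
    %{text: text, score: score} = Enum.max_by(results, & &1.score)
    {:ok, text, score}
  end
end

# Stand-in output with invented offsets and score, shaped like the map above.
sample = %{results: [%{text: "José Valim", start: 100, end: 110, score: 0.9}]}

case AnswerPicker.best(sample) do
  {:ok, text, score} -> IO.puts("#{text} (score #{Float.round(score, 3)})")
  :error -> IO.puts("no answer found")
end
```

In the notebook, the same call would take the value returned by Nx.Serving.run/2 in place of sample.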

Telemetry

Under native: true the forward pass is a single NIF replay, so the op-by-op [:emily, :eval, :stop] span never fires: there is no per-op boundary to time. The native-compiler event to watch instead is [:emily, :compiler, :fallback], a tripwire that fires only if an op can’t lower and routes through the evaluator. Attach a handler to it, run the forward pass, then read Emily.Memory.stats/0 (which itself emits [:emily, :memory, :stats]):

:telemetry.attach(
  "distilbert-qa-fallback",
  [:emily, :compiler, :fallback],
  fn _event, %{count: count}, %{reason: reason}, _config ->
    IO.puts("native fallback (#{count}): #{reason}")
  end,
  nil
)

Emily.Memory.reset_peak()
Nx.Serving.run(serving, %{question: question, context: context})
%{active: active, peak: peak} = Emily.Memory.stats()

IO.puts("no fallback above => forward lowered fully native")

IO.puts(
  "MLX memory — active #{div(active, 1024 * 1024)} MiB, " <>
    "peak #{div(peak, 1024 * 1024)} MiB"
)
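Two practical details of :telemetry are worth keeping in mind when re-running the cell above. Attaching a handler id that is already registered returns {:error, :already_exists}, and anonymous-function handlers trigger a logged notice about local functions. The sketch below shows one way around both, using a named module and a detach before attaching. FallbackLogger is a hypothetical name, and the measurement and metadata keys are simply copied from the handler above; reason is printed with inspect/1 because its type is not assumed here.

```elixir
defmodule FallbackLogger do
  # Hypothetical module; the pattern mirrors the anonymous handler shown earlier.
  def handle_event([:emily, :compiler, :fallback], %{count: count}, %{reason: reason}, _config) do
    IO.puts("native fallback (#{count}): #{inspect(reason)}")
  end
end

handler_id = "distilbert-qa-fallback"

# Detach first so the cell can be re-run; this returns {:error, :not_found}
# when no handler with that id is attached, which is safe to ignore.
:telemetry.detach(handler_id)

:ok =
  :telemetry.attach(
    handler_id,
    [:emily, :compiler, :fallback],
    &FallbackLogger.handle_event/4,
    nil
  )
```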

See Emily.Telemetry for the full event catalogue, including the [:emily, :fallback, *] span that fires whenever an op routes through Nx.BinaryBackend.
