# Getting Started with Stephen
```elixir
Mix.install([
  {:stephen, "~> 0.1.0"},
  {:exla, "~> 0.9"},
  {:kino, "~> 0.14"}
])

# Use EXLA for faster inference
Nx.global_default_backend(EXLA.Backend)
```
## Introduction
Stephen is a ColBERT-style neural retrieval library for Elixir. Unlike traditional vector search that compresses each document into a single embedding, ColBERT keeps one embedding per token, enabling fine-grained semantic matching.
This notebook walks you through:
- Loading the encoder
- Indexing documents
- Searching and understanding results
- Reranking candidates
- Debugging with explain
## Load the Encoder

The encoder downloads the ColBERT model (~500 MB) on first use, so the first run can take a minute.

```elixir
{:ok, encoder} = Stephen.load_encoder()
```
## Index Some Documents

Let’s index some facts about late-night talk show hosts:

```elixir
documents = [
  {"colbert", "Stephen Colbert hosted The Colbert Report on Comedy Central before taking over The Late Show on CBS"},
  {"conan", "Conan O'Brien is known for his self-deprecating humor, tall hair, and pale complexion"},
  {"seth", "Seth Meyers was head writer at SNL before hosting Late Night on NBC"},
  {"oliver", "John Oliver hosts Last Week Tonight on HBO, focusing on investigative journalism"}
]

index = Stephen.new_index(encoder)
index = Stephen.index(encoder, index, documents)
```
## Search

Now let’s search for relevant documents:

```elixir
query = "late night comedy hosts"
results = Stephen.search(encoder, index, query, top_k: 3)
```

The results show document IDs ranked by their MaxSim scores. Higher scores mean better semantic matches.
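To build intuition for those scores, here is a minimal, self-contained sketch of MaxSim over toy per-token embeddings (plain lists rather than Nx tensors; the `MaxSimSketch` module and its helpers are illustrative only, not part of Stephen's API):

```elixir
defmodule MaxSimSketch do
  # Cosine similarity between two plain-list vectors.
  def cosine(a, b) do
    dot = Enum.zip(a, b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
    norm = fn v -> :math.sqrt(Enum.sum(Enum.map(v, &(&1 * &1)))) end
    dot / (norm.(a) * norm.(b))
  end

  # MaxSim: for each query-token embedding, take its best-matching
  # document-token embedding, then sum those per-token maxima.
  def max_sim(query_embs, doc_embs) do
    query_embs
    |> Enum.map(fn q -> Enum.max(Enum.map(doc_embs, &cosine(q, &1))) end)
    |> Enum.sum()
  end
end

query_embs = [[1.0, 0.0], [0.0, 1.0]]
doc_embs = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]

MaxSimSketch.max_sim(query_embs, doc_embs)
```

Because every query token keeps its own embedding, a document is rewarded for matching each query token somewhere, which is what makes ColBERT-style scoring finer-grained than single-vector similarity.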
## Try Different Queries

```elixir
queries = [
  "political satire and journalism",
  "SNL comedy writers",
  "tall comedian with red hair"
]

for query <- queries do
  results = Stephen.search(encoder, index, query, top_k: 2)
  IO.puts("\n🔍 Query: #{query}")

  for %{doc_id: id, score: score} <- results do
    IO.puts("  #{id}: #{Float.round(score, 2)}")
  end
end

:ok
```
## Understanding Scores with Explain

Why did a document score the way it did? Use `explain/3` to see token-level matching:

```elixir
explanation =
  Stephen.explain(
    encoder,
    "political satire journalism",
    "John Oliver hosts Last Week Tonight on HBO, focusing on investigative journalism"
  )

explanation
|> Stephen.Scorer.format_explanation()
|> IO.puts()
```
The explanation shows which query tokens matched which document tokens and their similarity scores. This helps debug unexpected rankings.
## Reranking

Stephen excels at reranking candidates from a faster first-stage retriever (such as BM25 or Postgres full-text search).
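As a toy stand-in for that first stage, here is a self-contained keyword scorer (plain term overlap, deliberately much cruder than BM25) that produces candidate IDs in the shape `Stephen.rerank/4` expects; the `KeywordPrefilter` module is illustrative only:

```elixir
defmodule KeywordPrefilter do
  # Naive first-stage retriever: score each document by how many
  # query terms it contains, then keep the top-k document ids.
  def candidates(documents, query, k) do
    terms = query |> String.downcase() |> String.split()

    documents
    |> Enum.map(fn {id, text} ->
      words = text |> String.downcase() |> String.split() |> MapSet.new()
      {id, Enum.count(terms, &MapSet.member?(words, &1))}
    end)
    |> Enum.sort_by(fn {_id, hits} -> -hits end)
    |> Enum.take(k)
    |> Enum.map(fn {id, _hits} -> id end)
  end
end

documents = [
  {"colbert", "Stephen Colbert hosted The Colbert Report on Comedy Central"},
  {"oliver", "John Oliver hosts Last Week Tonight on HBO, focusing on investigative journalism"}
]

candidate_ids = KeywordPrefilter.candidates(documents, "investigative journalism", 2)
# candidate_ids can then be handed to Stephen.rerank/4 as below
```

The split of labor is the point: the cheap stage narrows thousands of documents to a handful, and the ColBERT scorer spends its compute only on that handful.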
### Rerank from Index

```elixir
# Pretend these came from a keyword search
candidate_ids = ["conan", "oliver", "seth"]

Stephen.rerank(encoder, index, "investigative news show", candidate_ids)
```
### Rerank Raw Text

No index is needed for ad-hoc reranking:

```elixir
candidates = [
  {"wiki1", "The Daily Show is an American late-night talk show"},
  {"wiki2", "60 Minutes is an investigative journalism program"},
  {"wiki3", "Last Week Tonight combines comedy with investigative journalism"}
]

Stephen.rerank_texts(encoder, "comedy investigative journalism", candidates)
```
## Query Expansion with PRF

Pseudo-relevance feedback (PRF) expands queries using terms from top-ranked documents:

```elixir
# Without PRF
basic = Stephen.search(encoder, index, "comedy writer", top_k: 2)
IO.inspect(basic, label: "Basic search")

# With PRF - finds related terms from top results
expanded = Stephen.search_with_prf(encoder, index, "comedy writer", top_k: 2)
IO.inspect(expanded, label: "With PRF")
```
PRF can improve recall when queries are short or ambiguous.
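Conceptually, classic PRF looks like the sketch below: run the query, harvest frequent terms from the top results, and append them to the query. This term-based `PrfSketch` module is an analogy for intuition only; a ColBERT-style retriever like Stephen would expand at the embedding level rather than on raw terms.

```elixir
defmodule PrfSketch do
  @stopwords MapSet.new(~w(the a an is on at of and for with))

  # Expand a query with the most frequent non-stopword terms
  # from the top-ranked documents of an initial search.
  def expand(query, top_docs, n_terms) do
    extra =
      top_docs
      |> Enum.flat_map(fn text ->
        text |> String.downcase() |> String.split(~r/\W+/, trim: true)
      end)
      |> Enum.reject(&MapSet.member?(@stopwords, &1))
      |> Enum.frequencies()
      |> Enum.sort_by(fn {_term, freq} -> -freq end)
      |> Enum.take(n_terms)
      |> Enum.map(fn {term, _freq} -> term end)

    query <> " " <> Enum.join(extra, " ")
  end
end

top_docs = [
  "Seth Meyers was head writer at SNL before hosting Late Night on NBC",
  "Conan O'Brien wrote for SNL and The Simpsons before late night"
]

PrfSketch.expand("comedy writer", top_docs, 2)
```

The expanded query carries terms like "snl" or "late" that the user never typed, which is exactly how PRF recovers relevant documents for short or ambiguous queries.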
## Interactive Search

Try your own queries:

```elixir
query_input = Kino.Input.text("Search query", default: "funny host")
```

Read the input in a separate cell, so re-evaluating that cell picks up whatever you typed above:

```elixir
query = Kino.Input.read(query_input)

if query != "" do
  results = Stephen.search(encoder, index, query, top_k: 4)

  rows =
    Enum.map(results, fn %{doc_id: id, score: score} ->
      {_, text} = Enum.find(documents, fn {doc_id, _} -> doc_id == id end)
      %{id: id, score: Float.round(score, 2), text: text}
    end)

  Kino.DataTable.new(rows)
else
  "Enter a query above"
end
```
## Saving and Loading Indexes

Persist indexes to disk:

```elixir
# Save
path = Path.join(System.tmp_dir!(), "stephen_demo_index")
:ok = Stephen.save_index(index, path)
IO.puts("Saved to #{path}")

# Load
{:ok, loaded_index} = Stephen.load_index(path)

# Verify it works
Stephen.search(encoder, loaded_index, "comedy", top_k: 1)
```
## Next Steps
- See Index Types for larger collections
- See Compression for memory-constrained environments
- See Chunking for long documents
## Cleanup

```elixir
# Clean up temp files
File.rm_rf!(path)
:ok
```