NomicBERT text embeddings on Emily

notebooks/nomic_embeddings.livemd

@ausimian

emily


Mix.install(
  [
    {:emily, "~> 0.4"},
    {:bumblebee, "~> 0.7"},
    {:tokenizers, "~> 0.5"},
    {:nx, "~> 0.12"},
    {:kino, "~> 0.14"}
  ],
  config: [
    nx: [default_backend: Emily.Backend]
  ]
)

Overview

This notebook runs the nomic-ai/nomic-embed-text-v1 encoder on Emily.Backend to produce sentence embeddings. NomicBERT is one of the three new model families shipped with Bumblebee 0.7. It is a long-context (8192-token) BERT variant with rotary position embeddings and SwiGLU feed-forward layers, intended as a drop-in replacement for sentence-transformers-style embedders.

The only Emily-specific setup is the config: option passed to Mix.install above, which makes Emily.Backend the default Nx backend; no further setup is required.
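To confirm that the config took effect, the default backend can be inspected directly. This is a small sanity check rather than a required step, and it assumes the setup cell above has already run so that Nx is loaded.

```elixir
# Assumes the setup cell above has already run, so Nx is loaded and configured.
{backend, _opts} = Nx.default_backend()
IO.inspect(backend, label: "default backend")

# A freshly created tensor carries its backend struct in the data field.
t = Nx.tensor([1.0, 2.0, 3.0])
IO.inspect(t.data.__struct__, label: "tensor backend")
```

With the config above, both lines should report Emily.Backend.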

Loading the model

{:ok, model_info} =
  Bumblebee.load_model({:hf, "nomic-ai/nomic-embed-text-v1"},
    module: Bumblebee.Text.NomicBert,
    architecture: :base
  )

{:ok, tokenizer} =
  Bumblebee.load_tokenizer({:hf, "nomic-ai/nomic-embed-text-v1"})

The first fetch downloads a checkpoint of roughly 550 MB. The module: and architecture: options are passed explicitly because the upstream config predates the Bumblebee 0.7 auto-detect mapping for this repo.

Building an embedding serving

serving =
  Bumblebee.Text.text_embedding(model_info, tokenizer,
    output_attribute: :hidden_state,
    output_pool: :mean_pooling,
    embedding_processor: :l2_norm,
    defn_options: [compiler: Emily.Compiler]
  )

NomicBERT’s :base graph returns :hidden_state, so the serving applies mean pooling over the sequence axis and then L2-normalises the result. This matches the recipe used by the upstream sentence-transformers adapter. Emily.Compiler pins the result backend to Emily.Backend and caps partition concurrency at 1.
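The pooling and normalisation steps can be reproduced by hand on a toy tensor. The sketch below is an illustration only, not Bumblebee's implementation: the PoolingSketch module, the toy shapes, and the use of an attention mask to skip padding positions are assumptions made for the example. It relies on Nx from the setup cell.

```elixir
defmodule PoolingSketch do
  # hidden: {batch, seq_len, dim} token vectors
  # mask:   {batch, seq_len}, 1 for real tokens and 0 for padding (assumed layout)
  def mean_pool(hidden, mask) do
    mask = Nx.new_axis(mask, -1)

    hidden
    |> Nx.multiply(mask)
    |> Nx.sum(axes: [1])
    |> Nx.divide(Nx.sum(mask, axes: [1]))
  end

  # Scale every row to unit Euclidean length.
  def l2_normalise(x) do
    norm =
      x
      |> Nx.pow(2)
      |> Nx.sum(axes: [-1], keep_axes: true)
      |> Nx.sqrt()

    Nx.divide(x, norm)
  end
end

# One sequence of three 2-d token vectors; the last position is padding.
hidden = Nx.tensor([[[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]])
mask = Nx.tensor([[1, 1, 0]])

pooled = PoolingSketch.mean_pool(hidden, mask)
unit = PoolingSketch.l2_normalise(pooled)
```

Here the two real tokens average to [1.5, 2.0], which has length 2.5, so the normalised vector is approximately [0.6, 0.8]. The padded position does not contribute.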

Embedding a few texts

texts = [
  "search_document: Elixir is a functional language for the BEAM.",
  "search_document: Rust is a systems language with strict ownership rules.",
  "search_query: which language runs on the BEAM?"
]

[doc_a, doc_b, query] =
  for %{embedding: e} <- Nx.Serving.run(serving, texts), do: e

NomicBERT expects a task prefix such as search_document: or search_query: on every input. Without it, the query and document embedding spaces don’t align and similarities collapse. See the model card for the full prefix list (classification:, clustering:, etc.).
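Since a forgotten prefix fails silently, it can help to add prefixes in one place. The TaskPrefix module below is a hypothetical helper, not part of Bumblebee or Emily, and it lists only the prefixes named in this notebook; check the model card before relying on any others.

```elixir
defmodule TaskPrefix do
  # Hypothetical helper. Only prefixes mentioned in this notebook are listed.
  @prefixes %{
    document: "search_document: ",
    query: "search_query: ",
    classification: "classification: ",
    clustering: "clustering: "
  }

  # Raises KeyError for an unknown task instead of silently skipping the prefix.
  def add(text, task) when is_binary(text) do
    Map.fetch!(@prefixes, task) <> text
  end
end

prefixed_query = TaskPrefix.add("which language runs on the BEAM?", :query)
```

Raising on an unknown task is a deliberate choice here: an unprefixed input would still embed, just into the wrong space.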

Cosine similarity

defmodule Cosine do
  def sim(a, b) do
    a
    |> Nx.multiply(b)
    |> Nx.sum()
    |> Nx.to_number()
  end
end

%{
  query_vs_elixir: Cosine.sim(query, doc_a),
  query_vs_rust: Cosine.sim(query, doc_b)
}

Because the embeddings are L2-normalised, the dot product is the cosine similarity. The Elixir document should score noticeably higher than the Rust one.
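The equivalence can be checked on toy vectors. The sketch below compares the full cosine formula, dot(a, b) / (|a| * |b|), with the plain dot product. The CosineCheck module and the sample vectors are made up for illustration, and Nx is assumed to be available from the setup cell.

```elixir
defmodule CosineCheck do
  # Full cosine similarity with explicit norms.
  def full(a, b) do
    a
    |> Nx.multiply(b)
    |> Nx.sum()
    |> Nx.divide(Nx.multiply(norm(a), norm(b)))
    |> Nx.to_number()
  end

  # Plain dot product, as used by Cosine.sim/2 in this notebook.
  def dot(a, b) do
    a |> Nx.multiply(b) |> Nx.sum() |> Nx.to_number()
  end

  defp norm(x), do: x |> Nx.pow(2) |> Nx.sum() |> Nx.sqrt()
end

# Toy vectors standing in for real embeddings.
unit_a = Nx.tensor([0.6, 0.8])
unit_b = Nx.tensor([1.0, 0.0])
raw_a = Nx.tensor([3.0, 4.0])

%{
  unit_full: CosineCheck.full(unit_a, unit_b),
  unit_dot: CosineCheck.dot(unit_a, unit_b),
  raw_full: CosineCheck.full(raw_a, unit_b),
  raw_dot: CosineCheck.dot(raw_a, unit_b)
}
```

For the unit-length pair the two values agree (about 0.6). For the unnormalised raw_a the dot product is 3.0 while the cosine is still about 0.6, which is why the shortcut is only valid with embedding_processor: :l2_norm in place.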

Telemetry

Emily emits :telemetry events at the evaluation boundary. Attach a handler to sample timing for each forward pass:

:telemetry.attach(
  "nomic-embed",
  [:emily, :eval, :stop],
  fn _event, %{duration: duration}, _meta, _config ->
    ms = System.convert_time_unit(duration, :native, :millisecond)
    IO.puts("eval #{ms} ms")
  end,
  nil
)

Nx.Serving.run(serving, "search_query: how fast is Emily?")
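System.convert_time_unit/3 returns an integer and rounds down, so the handler above reports 0 ms for any pass shorter than a millisecond. One way around this, sketched here with a hypothetical EvalTiming helper, is to convert to microseconds first and divide.

```elixir
defmodule EvalTiming do
  # Hypothetical helper: native-unit duration to fractional milliseconds.
  def to_ms(duration) do
    System.convert_time_unit(duration, :native, :microsecond) / 1000
  end
end

# 1500 microseconds expressed in native units, then converted back.
sample = System.convert_time_unit(1500, :microsecond, :native)
sample_ms = EvalTiming.to_ms(sample)
```

Inside the handler, EvalTiming.to_ms(duration) would replace the direct call to System.convert_time_unit/3.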

See Emily.Telemetry for the full event catalogue, including the [:emily, :block, :fallback] event that fires whenever an op routes through Nx.BinaryBackend.
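A handler for the fallback event can be attached in the same way. In the sketch below the FallbackLogger module and the handler id are hypothetical, and nothing is assumed about the measurement or metadata keys; they are printed as received. Using a captured module function rather than an anonymous one avoids the local-function warning that :telemetry logs.

```elixir
defmodule FallbackLogger do
  # Hypothetical handler module. Measurement and metadata shapes are not
  # assumed; they are printed as received.
  def handle_event(event, measurements, metadata, _config) do
    IO.puts("#{inspect(event)} #{inspect(measurements)} #{inspect(metadata)}")
  end
end

:telemetry.attach(
  "nomic-embed-fallback",
  [:emily, :block, :fallback],
  &FallbackLogger.handle_event/4,
  nil
)
```

Call :telemetry.detach("nomic-embed-fallback") to remove the handler once the investigation is done.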
