
Custom Ollama Adapter


Mix.install(
  [
    {:instructor_lite, "~> 1.1"},
    {:req, "~> 0.5"}
  ]
)

About Ollama

The Ollama project provides a way to run open source LLMs locally, provided you have capable hardware.

To run a model, just pick one from the model library and run the CLI command:

> ollama run deepseek-r1:8b
pulling manifest
pulling e6a7edc1a4d7: 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████▏ 5.2 GB
pulling c5ad996bda6e: 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████▏  556 B
pulling 6e4c38e1172f: 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████▏ 1.1 KB
pulling ed8474dc73db: 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████▏  179 B
pulling f64cd5418e4b: 100% ▕███████████████████████████████████████████████████████████████████████████████████████████████▏  487 B
verifying sha256 digest
writing manifest
success
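
Once the model has been pulled, you can sanity-check from this notebook that the local server is reachable. This step is optional and assumes Ollama is listening on its default port 11434; the /api/tags endpoint lists the models available locally:

# Optional sanity check: ask the local Ollama server which models it has.
# Assumes the server is running on the default port, 11434.
response = Req.get!("http://localhost:11434/api/tags")
Enum.map(response.body["models"], & &1["name"])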

Open source models come in so many shapes and sizes that it is impractical to try to fit them all into a single adapter. But have no fear: InstructorLite makes it simple to write one for your specific needs!

Custom Adapter

For this exercise, we’ll write a custom adapter for Ollama running the deepseek-r1 model.

To do this, we need to create a module implementing the InstructorLite.Adapter behaviour, which requires four callbacks. Let’s go through them one by one.
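
Before diving in, here is a rough sketch of the shape we are aiming for: four functions with placeholder bodies. Each section below replaces one placeholder with a real implementation, and the @behaviour declaration comes at the very end.

# Placeholder sketch of the adapter; every body here is replaced with a real
# implementation in the sections that follow.
defmodule OllamaAdapterSketch do
  # Makes the HTTP call to the Ollama endpoint with the given params
  def send_request(_params, _options), do: {:error, :not_implemented}

  # Enriches user-provided params with the system prompt and JSON schema
  def initial_prompt(params, _opts), do: params

  # Extracts a parsed JSON object from the raw response
  def parse_response(_response, _opts), do: {:error, :not_implemented}

  # Builds a follow-up prompt when the output fails validation
  def retry_prompt(params, _resp_params, _errors, _response, _opts), do: params
end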

The first callback is send_request/2, which is responsible for making a call with params to some kind of API endpoint. Looking at the Ollama docs, chat completion seems like a good endpoint to aim for, but there’s no need to hardcode it in the adapter. We’ll just require callers to provide it in full, including hostname and port. Since this is a model running locally, it will likely be something like http://localhost:11434/api/chat.

defmodule OllamaAdapterSendRequest do
  def send_request(params, options) do
    # Remember, adapter-specific options are nested under the `adapter_context` key
    url = options |> Keyword.get(:adapter_context, []) |> Keyword.fetch!(:url)

    case Req.post(url, json: params) do
      {:ok, %{status: 200, body: body}} -> {:ok, body}
      {:ok, response} -> {:error, response}
      {:error, reason} -> {:error, reason}
    end
  end
end
{:module, OllamaAdapterSendRequest, <<70, 79, 82, 49, 0, 0, 8, ...>>, {:send_request, 2}}

The next callback is initial_prompt/2. This one enriches user-provided params with whatever is necessary to convince the LLM to fill the JSON schema. This part is very model-specific, so we need to consult the docs.

Fast-forwarding a bit: after reading the model card, the chat completion endpoint documentation, and doing some exploratory testing, we see that the endpoint supports structured output, and all we need to do is provide the JSON schema in the format parameter. Okay, let's implement the initial_prompt/2 callback with this knowledge.

defmodule OllamaAdapterInitialPrompt do
  def initial_prompt(params, opts) do
    # This prompt will be a message with a "system" role
    sys_message = [
      %{
        role: "system",
        content: InstructorLite.Prompt.prompt(opts)
      }
    ]

    params
    # This adapter might work with different models,
    # so let's put a default one but not insist on it
    |> Map.put_new(:model, "deepseek-r1:8b")
    |> Map.put(:stream, false)
    # The user has likely provided their own prompt, so we need to be nice and not overwrite it
    |> Map.update(:messages, sys_message, fn msgs -> sys_message ++ msgs end)
    |> Map.put(:format, Keyword.fetch!(opts, :json_schema))
  end
end
{:module, OllamaAdapterInitialPrompt, <<70, 79, 82, 49, 0, 0, 9, ...>>, {:initial_prompt, 2}}

At this point, we can actually try to combine both callbacks and see if we can get a response.

json_schema = InstructorLite.JSONSchema.from_ecto_schema(%{name: :string})
prompt = OllamaAdapterInitialPrompt.initial_prompt(%{messages: [
  %{role: "user", content: "Who was the first president of the united states?"}
]}, json_schema: json_schema)
%{
  messages: [
    %{
      role: "system",
      content: "You're called by an Elixir application through the InstructorLite library. Your task is to understand what the application wants you to do and respond with JSON output that matches the schema. The output will be validated by the application against an Ecto schema and potentially some custom rules. You may be asked to adjust your response if it doesn't pass validation. "
    },
    %{role: "user", content: "Who was the first president of the united states?"}
  ],
  stream: false,
  format: %{
    type: "object",
    title: "root",
    required: [:name],
    additionalProperties: false,
    properties: %{name: %{type: "string"}}
  },
  model: "deepseek-r1:8b"
}
url = "http://localhost:11434/api/chat"
{:ok, response} = OllamaAdapterSendRequest.send_request(prompt, adapter_context: [url: url])
{:ok,
 %{
   "created_at" => "2025-08-25T01:19:07.12172Z",
   "done" => true,
   "done_reason" => "stop",
   "eval_count" => 14,
   "eval_duration" => 1404543334,
   "load_duration" => 81403708,
   "message" => %{
     "content" => "{\n  \"name\": \"George_Washington\"\n}\n\n  ",
     "role" => "assistant"
   },
   "model" => "deepseek-r1:8b",
   "prompt_eval_count" => 84,
   "prompt_eval_duration" => 342887500,
   "total_duration" => 1831366625
 }}

Great! This brings us to the parse_response/2 callback. Its job is to extract a parsed JSON object from the raw response, which we’ll subsequently attempt to cast to our Ecto schema.

defmodule OllamaAdapterParseResponse do
  def parse_response(response, _opts) do
    case response do
      %{"message" => %{"content" => json}} ->
        InstructorLite.JSON.decode(json)

      other ->
        {:error, :unexpected_response, other}
    end
  end
end

{:ok, params} = OllamaAdapterParseResponse.parse_response(response, [])
{:ok, %{"name" => "George_Washington"}}

That is actually enough for a happy path! But there is one more callback that deals with cases when the output doesn’t quite adhere to the schema. Imagine if we received a different response from a model:

hypothetical = %{"name" => false}
changeset = Ecto.Changeset.cast({%{}, %{name: :string}}, hypothetical, [:name])
errors = InstructorLite.ErrorFormatter.format_errors(changeset)
"name - is invalid"

We need to retry our query and relay the errors to the model so it can correct its previous answer. retry_prompt/5 is the biggest callback of all, simply because it can potentially need a lot of context to do that: it needs to know what the params were, which data we attempted to cast, which errors we saw, the raw response and, of course, the options. For this adapter, we will only need the params, errors and raw response.

defmodule OllamaAdapterRetryPrompt do
  def retry_prompt(params, _resp_params, errors, response, _opts) do
    message = Map.fetch!(response, "message")

    do_better = [
      message,
      %{
        role: "user",
        content: InstructorLite.Prompt.validation_failed(errors)
      }
    ]

    Map.update(params, :messages, do_better, fn msgs -> msgs ++ do_better end)
  end
end

retry_prompt = OllamaAdapterRetryPrompt.retry_prompt(prompt, nil, errors, response, [])
%{
  messages: [
    %{
      role: "system",
      content: "You're called by an Elixir application through the InstructorLite library. Your task is to understand what the application wants you to do and respond with JSON output that matches the schema. The output will be validated by the application against an Ecto schema and potentially some custom rules. You may be asked to adjust your response if it doesn't pass validation. "
    },
    %{role: "user", content: "Who was the first president of the united states?"},
    %{"content" => "{\n  \"name\": \"George_Washington\"\n}\n\n  ", "role" => "assistant"},
    %{
      role: "user",
      content: "The response did not pass validation. Please try again and fix the following validation errors:\n\nname - is invalid\n"
    }
  ],
  stream: false,
  format: %{
    type: "object",
    title: "root",
    required: [:name],
    additionalProperties: false,
    properties: %{name: %{type: "string"}}
  },
  model: "deepseek-r1:8b"
}

This retry prompt can now be used for a send_request/2 call. We won’t do it though, to avoid gaslighting the model by suggesting its answer was incorrect.
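
For the record, the retry round-trip would be just another send_request/2 call using the retry prompt we built above; it is left commented out on purpose:

# Shown for illustration only; we deliberately don't run it, since the original
# answer was actually fine.
# {:ok, retry_response} =
#   OllamaAdapterSendRequest.send_request(retry_prompt, adapter_context: [url: url])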

Optional Callbacks

If you plan to use your adapter for non-structured output through the InstructorLite.ask/2 function, you'll also need to implement the optional c:InstructorLite.Adapter.find_output/2 callback. Its role is simply to find the output in the response. In our case, the callback is exactly the same as parse_response/2, the only difference being that it doesn't need to parse JSON.

defmodule OllamaAdapterFindOutput do
  def find_output(response, _opts) do
    case response do
      %{"message" => %{"content" => output}} ->
        {:ok, output}

      other ->
        {:error, :unexpected_response, other}
    end
  end
end

{:ok, params} = OllamaAdapterFindOutput.find_output(response, [])
{:ok, "{\n  \"name\": \"George_Washington\"\n}\n\n  "}

Tying It All Together

Now we can put all these callbacks into a single module, confess to the compiler that our intention is to implement a behaviour, and let the main InstructorLite interface handle all the logistics.

defmodule OllamaAdapter do
  @behaviour InstructorLite.Adapter

  @impl InstructorLite.Adapter
  def send_request(params, options) do
    url = options |> Keyword.get(:adapter_context, []) |> Keyword.fetch!(:url)

    case Req.post(url, json: params, receive_timeout: 120_000) do
      {:ok, %{status: 200, body: body}} -> {:ok, body}
      {:ok, response} -> {:error, response}
      {:error, reason} -> {:error, reason}
    end
  end

  @impl InstructorLite.Adapter
  def initial_prompt(params, opts) do
    sys_message = [
      %{
        role: "system",
        content: InstructorLite.Prompt.prompt(opts)
      }
    ]

    params
    |> Map.put_new(:model, "deepseek-r1:8b")
    |> Map.put(:stream, false)
    |> Map.update(:messages, sys_message, fn msgs -> sys_message ++ msgs end)
    |> Map.put(:format, Keyword.fetch!(opts, :json_schema))
  end

  @impl InstructorLite.Adapter
  def parse_response(response, opts) do
    with {:ok, json} <- find_output(response, opts) do
      InstructorLite.JSON.decode(json)
    end
  end

  @impl InstructorLite.Adapter
  def find_output(response, _opts) do
    case response do
      %{"message" => %{"content" => output}} ->
        {:ok, output}

      other ->
        {:error, :unexpected_response, other}
    end
  end

  @impl InstructorLite.Adapter
  def retry_prompt(params, _resp_params, errors, response, _opts) do
    message = Map.fetch!(response, "message")

    do_better = [
      message,
      %{
        role: "user",
        content: InstructorLite.Prompt.validation_failed(errors)
      }
    ]

    Map.update(params, :messages, do_better, fn msgs -> msgs ++ do_better end)
  end
end

InstructorLite.instruct(%{
    messages: [
      %{role: "user", content: "Dr. John Doe is forty two"}
    ]
  },
  adapter: OllamaAdapter,
  notes: "Trim down all people's titles",
  response_model: %{name: :string, age: :integer},
  adapter_context: [
    url: "http://localhost:11434/api/chat"
  ]
)
{:ok, %{name: "John Doe", age: 42}}

Note how we used find_output/2 in parse_response/2 to DRY our code a little.
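
One more detail worth noting: retry_prompt/5 only kicks in if we allow retries. The sketch below assumes the max_retries option from the current InstructorLite docs (retries are off by default; verify the option name against the version you have installed):

# A sketch: same call as above, but allowing one validation-driven retry.
# The :max_retries option name is an assumption based on current InstructorLite
# docs; check it against your installed version.
InstructorLite.instruct(%{
    messages: [
      %{role: "user", content: "Dr. John Doe is forty two"}
    ]
  },
  adapter: OllamaAdapter,
  notes: "Trim down all people's titles",
  response_model: %{name: :string, age: :integer},
  max_retries: 1,
  adapter_context: [
    url: "http://localhost:11434/api/chat"
  ]
)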

Let’s now try InstructorLite.ask/2:

InstructorLite.ask(%{
    model: "deepseek-r1:8b",
    stream: false,
    think: false,
    messages: [
      %{role: "user", content: "Cite me the greatest opening line in the history of cyberpunk. No yapping, just the line."}
    ]
  },
  adapter: OllamaAdapter,
  adapter_context: [
    url: "http://localhost:11434/api/chat"
  ]
)
{:ok, "\" All secrets are best kept cool.\" - Neuromancer"}