Tutorial
Mix.install([
{:openai_responses, "~> 0.3.0"},
# {:openai_responses, path: "~/src/openai_responses"},
{:kino, "~> 0.11.0"}
])
Introduction
The only setup you need to use the library is an OpenAI API key. If you already have the OPENAI_API_KEY
environment variable set, you can start right away.
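If you want to fail fast when the key is missing, you can check for it explicitly before making any calls (a minimal sketch using only the Elixir standard library):
# Raises if the variable is not set; we discard the value to avoid printing the key
_ = System.fetch_env!("OPENAI_API_KEY")
:ok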
alias OpenAI.Responses
alias OpenAI.Responses.Helpers
Basic usage
create/1 requires a keyword list with at least two keys: the model name and the input text:
{:ok, response} = Responses.create(model: "gpt-4.1-nano", input: "Write a haiku about programming")
The response is just a map, and you can use helper functions to extract information from it:
Helpers.has_refusal?(response)
Helpers.output_text(response)
Helpers.token_usage(response)
Helpers.calculate_cost(response)
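create/1 returns a tagged tuple, so in real code you will usually want to handle the error case as well. A hedged sketch; the exact error term depends on the library version:
case Responses.create(model: "gpt-4.1-nano", input: "Say hello") do
  # On success, extract the text as before
  {:ok, response} -> IO.puts(Helpers.output_text(response))
  # On failure, inspect whatever error term the library returns
  {:error, error} -> IO.inspect(error, label: "API error")
end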
You can also supply additional parameters to the API call using the keyword list:
{:ok, response} =
Responses.create(
model: "gpt-4.1-nano",
input: "Do you need semicolons in Elixir",
instructions: "Talk like a pirate"
)
IO.puts Helpers.output_text(response)
A structured input can be manually constructed and passed to create/1:
{:ok, response} =
Responses.create(
model: "gpt-4.1-nano",
input: [
%{role: "user", content: "knock knock."},
%{role: "assistant", content: "Who's there?"},
%{role: "user", content: "Orange."}
]
)
IO.puts Helpers.output_text(response)
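Structured inputs can also include images, using content parts of type input_text and input_image: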
input = [
%{
"role" => "user",
"content" => [
%{"type" => "input_text", "text" => "What is in this image?"},
%{
"type" => "input_image",
"image_url" => "https://upload.wikimedia.org/wikipedia/commons/d/d2/Three_early_medicine_bottles.jpg"
}
]
}
]
{:ok, response} = Responses.create(model: "gpt-4.1-nano", input: input)
IO.puts Helpers.output_text(response)
Image helpers
As we saw in the previous section, you can manually create a structured input with images, but this requires writing verbose JSON-like structures. The library provides helper functions to make this process more ergonomic.
# Using the helper function to create a message with an image
input_message = Helpers.create_message_with_images(
"What is in this image?",
"https://upload.wikimedia.org/wikipedia/commons/d/d2/Three_early_medicine_bottles.jpg"
)
# The helper creates the same structure as the manual approach, but with less code
input_message
You can also specify multiple images with different detail levels:
multi_image_message = Helpers.create_message_with_images(
"Compare these two images",
[
{"https://upload.wikimedia.org/wikipedia/commons/d/d2/Three_early_medicine_bottles.jpg", "high"},
"https://upload.wikimedia.org/wikipedia/commons/4/48/Cocacolacollection.JPG"
],
detail: "low" # Default detail level for images without a specific level
)
# And then use it with the API
{:ok, response} = Responses.create(model: "gpt-4.1-nano", input: [multi_image_message])
IO.puts Helpers.output_text(response)
Local image files are also supported and will be automatically encoded as base64 data URLs:
# This would work if you have these image files locally
# local_image_message = Helpers.create_message_with_images(
# "Describe these local images",
# ["path/to/image1.jpg", "path/to/image2.png"]
# )
The helper function eliminates boilerplate code, handles encoding of local images, and provides a more intuitive interface for working with images in your prompts.
Using built-in tools
The following example illustrates built-in tools: we ask the same question first without any tools, then with the web_search_preview tool enabled:
{:ok, response_no_tools} = Responses.create(model: "gpt-4.1-nano", input: "What's the weather in San Francisco?")
IO.puts(Helpers.output_text(response_no_tools))
{:ok, response_with_search} =
  Responses.create(
    model: "gpt-4.1-mini",
    input: "What's the weather in San Francisco?",
    tools: [%{type: "web_search_preview"}],
    temperature: 0.7
  )
IO.puts(Helpers.output_text(response_with_search))
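Since the response is a plain map that mirrors the OpenAI Responses API, you can also inspect which output items the model produced, such as web search calls. A hedged sketch, assuming the standard API field layout:
# Each output item carries a "type", e.g. "message" or "web_search_call"
for item <- response_with_search["output"], do: IO.puts(item["type"])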
Stored conversation state
The Responses API makes it much easier to continue a conversation without repeating its whole history. Note that responses are stored by default.
One thing to be mindful of: when using previous_response_id, the instructions from a previous response are not carried over to the next response.
{:ok, response} =
Responses.create(
model: "gpt-4.1-nano",
input: [
%{
role: "developer",
content: "Talk like a pirate in all your responses, including all the interesting facts."
},
%{
role: "user",
content: """
Give me the names of the first 6 U.S. presidents. Include their names, dates of birth and death,
and information on their terms. Provide some interesting facts about each
"""
}
]
)
Helpers.output_text(response) |> IO.puts()
Now we can get information about the next 6 presidents, without explicitly repeating our prompt, by referring to the response["id"] from the previous response:
{:ok, next_response} =
Responses.create(model: "gpt-4.1-nano", input: "next 6", previous_response_id: response["id"])
Helpers.output_text(next_response) |> IO.puts()
{:ok, next_response2} =
Responses.create(model: "gpt-4.1-nano", input: "next 6", previous_response_id: next_response["id"])
{:ok, next_response3} =
Responses.create(model: "gpt-4.1-nano", input: "next 6", previous_response_id: next_response2["id"])
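As the caveat above implies, if we had used the instructions parameter instead of a developer message, we would need to re-supply it on every continuation. A minimal sketch:
{:ok, _pirate_response} =
  Responses.create(
    model: "gpt-4.1-nano",
    input: "next 6",
    # instructions are not inherited from the previous response, so repeat them
    instructions: "Talk like a pirate.",
    previous_response_id: next_response3["id"]
  )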
Let’s compare the token usage of these requests. As you can see, even though the explicit prompt we supplied was very short, OpenAI still counts all the relevant input tokens from the previous turns. However, for prompts longer than 1024 tokens, OpenAI enables caching automatically, which can be seen below:
IO.puts inspect(Helpers.token_usage(response), pretty: true)
IO.puts inspect(Helpers.token_usage(next_response), pretty: true)
IO.puts inspect(Helpers.token_usage(next_response3), pretty: true)
Structured outputs
Structured output ensures the model always generates responses that adhere to your supplied JSON Schema.
alias OpenAI.Responses.Schema
# Define a schema for a calendar event
calendar_event_schema = Schema.object(%{
name: :string,
date: :string,
participants: {:array, :string}
})
# Parse a response with structured output
{:ok, result} = Responses.parse(
calendar_event_schema,
model: "gpt-4.1-nano",
input: "Alice and Bob are going to a science fair on Friday.",
schema_name: "event"
)
result.parsed
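Individual fields can be read directly from the parsed map (assuming string keys, since the decoded JSON is not transformed):
result.parsed["participants"]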
# Define a more complex schema with nested objects
math_reasoning_schema = Schema.object(%{
steps: {:array, Schema.object(%{
explanation: :string,
output: :string
})},
final_answer: :string
})
# Parse with the complex schema
{:ok, solution} = Responses.parse(
math_reasoning_schema,
model: "gpt-4.1-mini",
input: "Solve 8x + 7 = -23"
)
solution.parsed
The returned map also defines :raw_response and :token_usage if you need to get additional metadata from the LLM provider.
solution.raw_response
Streaming responses
OpenAI.Responses supports true streaming, where you can process chunks as they arrive without waiting for the entire response to complete.
Real-time text streaming
This example demonstrates how to display text as it arrives in real-time using Kino.Frame:
frame = Kino.Frame.new()
Kino.render(frame)
# Create a stream from OpenAI
stream = Responses.stream(model: "gpt-4.1-mini", input: "Write a short poem about coding in Elixir")
# Extract the text deltas; no need to initialize a stream handler
text_stream = Responses.text_deltas(stream)
Kino.Frame.append(frame, Kino.Markdown.new("## Poem about coding\n"))
# Process the stream
text_stream
|> Stream.each(fn delta ->
Kino.Frame.append(frame, Kino.Markdown.new(delta, chunk: true))
end)
|> Stream.run()
Kino.Frame.append(frame, Kino.Markdown.new("\n\n*Generation complete* ✨"))
:done
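If you do not need real-time rendering, you can also collect the streamed text into a single string. A stream can only be consumed once, so we create a fresh one here:
poem =
  Responses.stream(model: "gpt-4.1-mini", input: "Write a short poem about coding in Elixir")
  |> Responses.text_deltas()
  |> Enum.join()

IO.puts(poem)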