Observability & Tracing
Section
Explore PtcRunner’s tracing layers interactively: in-memory Tracer and file-based TraceLog.
Setup
repo_root = Path.expand("..", __DIR__)
deps =
if File.exists?(Path.join(repo_root, "mix.exs")) do
[{:ptc_runner, path: repo_root}]
else
[{:ptc_runner, "~> 0.10.1"}]
end
Mix.install(deps ++ [{:req_llm, "~> 1.8"}, {:kino, "~> 0.14"}], consolidate_protocols: false)
local_path = Path.join(__DIR__, "llm_setup.exs")
if File.exists?(local_path) do
Code.require_file(local_path)
else
%{body: code} = Req.get!("https://raw.githubusercontent.com/andreasronge/ptc_runner/main/livebooks/llm_setup.exs")
Code.eval_string(code)
end
setup = LLMSetup.setup()
setup = LLMSetup.choose_provider(setup)
my_llm = LLMSetup.choose_model(setup)
Introduction
PtcRunner provides three complementary tracing layers:
| Layer | Purpose | Output |
|---|---|---|
| Tracer | Lightweight stats & inspection in code |
In-memory struct on Step |
| TraceLog | Offline analysis, Chrome DevTools flame charts | JSONL files |
SubAgent.Loop
├── builds Tracer struct ────► entries stored on Step.tracer
└── emits :telemetry events ──► TraceLog.Handler ──► Collector ──► .jsonl file
Each layer is independent. Use both, either, or neither.
1. In-Memory Tracing with Tracer
The Tracer struct lives on every Step. No setup required — it’s always there.
alias PtcRunner.SubAgent
agent = SubAgent.new(
prompt: "What are the three primary colors?",
signature: "() -> :string",
max_turns: 1
)
{:ok, step} = SubAgent.run(agent, llm: my_llm)
IO.puts("=== Result ===")
IO.puts(step.return)
IO.puts("\n=== Usage ===")
IO.inspect(step.usage, label: "usage")
IO.puts("\n=== Turn History ===")
for turn <- step.turns do
IO.puts("Turn #{turn.number}: #{String.slice(turn.program || "", 0, 80)}")
IO.puts(" Tools: #{inspect(Enum.map(turn.tool_calls, & &1.name))}")
IO.puts(" Success: #{turn.success?}")
end
step
Debug Trace
SubAgent.Debug.print_trace/2 gives a quick visual overview:
SubAgent.Debug.print_trace(step)
# Include token usage stats
SubAgent.Debug.print_trace(step, usage: true)
Trace Filtering
In production, you may want to collect traces only on failure:
# Only keep trace on failure (saves memory)
# SubAgent.run(agent, llm: my_llm, trace: :on_error)
# Disable tracing entirely
# SubAgent.run(agent, llm: my_llm, trace: false)
"See docs/guides/subagent-observability.md for details"
2. File-Based Tracing with TraceLog
Wrap any execution in TraceLog.with_trace/2 to capture a JSONL trace file.
alias PtcRunner.TraceLog
alias PtcRunner.TraceLog.Analyzer
# A multi-turn agent with a tool, so we get richer traces
calc_tool = fn %{"expression" => expr} ->
{result, _} = Code.eval_string(expr)
result
end
agent = SubAgent.new(
prompt: """
Calculate the following and return the final answer:
1. What is 17 * 23?
2. What is the result plus 100?
""",
signature: "{step1 :int, step2 :int}",
tools: %{
"calc" => {calc_tool,
signature: "(expression :string) -> :float",
description: "Evaluate a math expression. Example: (tool/calc {:expression \"2 + 3\"})"}
},
max_turns: 3,
timeout: 15_000
)
{:ok, {_result, trace_step}, trace_path} = TraceLog.with_trace(fn ->
{result, step} = SubAgent.run(agent, llm: my_llm, debug: true)
{result, step}
end)
IO.puts("Trace saved to: #{File.cwd!()}/#{trace_path}")
trace_path
SubAgent.Debug.print_trace(trace_step, raw: false)
Analyze the Trace
events = Analyzer.load(trace_path)
IO.puts("=== Summary ===")
summary = Analyzer.summary(events)
IO.inspect(summary, pretty: true)
IO.puts("\n=== Timeline ===")
Analyzer.print_timeline(events)
IO.puts("\n=== Slowest Operations ===")
for event <- Analyzer.slowest(events, 3) do
IO.puts(" #{event["event"]} - #{event["duration_ms"]}ms")
end
:ok
Filter Events
# Only LLM events
llm_events = Analyzer.filter(events, type: "llm")
IO.puts("LLM events: #{length(llm_events)}")
# Only tool events
tool_events = Analyzer.filter(events, type: "tool")
IO.puts("Tool events: #{length(tool_events)}")
# Events slower than 100ms
slow = Analyzer.filter(events, min_duration_ms: 100)
IO.puts("Slow events (>100ms): #{length(slow)}")
:ok
Context Compaction in Traces
Compaction (compaction: true on the agent) trims older messages from the LLM-input
history once turn count or estimated tokens cross a threshold, keeping recent turns
verbatim. When it fires, it emits a [:ptc_runner, :sub_agent, :compaction, :triggered]
telemetry event that lands in the trace. See the
Compaction guide for the full
option reference.
To make this deterministic in a livebook we drive a long multi-turn run with a stub LLM that answers each turn until it eventually returns. The default trigger fires once a multi-turn agent’s turn count exceeds 8.
# Stub LLM: ten "still working" turns, then a return on turn 11.
# Returning a string is the simplest shape the LLM callback contract accepts.
stub_llm =
fn %{turn: turn} ->
response =
if turn >= 11 do
"(return {:answer 42})"
else
"(println \"still working on turn #{turn}\")"
end
{:ok, response}
end
long_agent =
SubAgent.new(
prompt: "Run a long multi-turn task and eventually return {:answer 42}",
signature: "{answer :int}",
max_turns: 12,
compaction: true
)
{:ok, {_, compact_step}, compact_trace_path} =
TraceLog.with_trace(fn ->
{:ok, step} = SubAgent.run(long_agent, llm: stub_llm)
{:ok, step}
end)
IO.inspect(compact_step.usage.compaction, label: "step.usage.compaction")
The :triggered flag tells you whether trimming actually happened on this run, and
messages_before / messages_after show how many LLM-input messages were dropped.
reason is :turn_pressure or :token_pressure depending on which threshold fired
first.
Filter the trace to just the compaction firings:
compact_events = Analyzer.load(compact_trace_path)
compactions = Analyzer.filter(compact_events, type: "compaction")
for e <- compactions do
d = e["data"] || %{}
IO.puts(
"turn #{e["turn"]}: " <>
"#{d["messages_before"]} → #{d["messages_after"]} messages " <>
"(reason: #{d["reason"]})"
)
end
> When you open the trace in mix ptc.viewer, turns that fired compaction show a
> ✂N→M badge on their pill, and Chrome DevTools (next section) renders each firing
> as a flag marker on the timeline.
Chrome DevTools Export
Export the trace to Chrome’s flame chart format for visual analysis.
{:ok, tree} = Analyzer.load_tree(trace_path)
chrome_path = Path.rootname(trace_path) <> ".json"
:ok = Analyzer.export_chrome_trace(tree, chrome_path)
IO.puts("Chrome trace exported to: #{chrome_path}")
IO.puts("")
IO.puts("To view the flame chart:")
IO.puts(" 1. Open Chrome DevTools (F12) → Performance tab → Load profile...")
IO.puts(" 2. Or navigate to chrome://tracing and load the file")
chrome_path
The flame chart shows nested spans — each turn contains its LLM call, and the run span wraps everything:
Path.join([__DIR__, "..", "images", "chrome_tracing.png"])
|> Path.expand()
|> File.read!()
|> Kino.Image.new("image/png")
Interactive Trace Tree (Kino Widget)
If you ran the agent above, the step struct can be visualized with Kino.TraceTree:
PtcRunner.Kino.TraceTree.new(trace_step)
3. Cross-Process Tracing
When PTC-Lisp runs pmap or pcalls, tools execute in parallel sandbox processes.
TraceContext.capture/0 and TraceContext.attach/1 ensure all events are correlated
back to the parent trace — this happens automatically.
Here’s a demonstration with parallel tool calls:
alias PtcRunner.TraceLog
alias PtcRunner.TraceLog.Analyzer
# Tools that simulate parallel work
fetch_tool = fn %{"city" => city} ->
# Simulate some work
Process.sleep(50)
temps = %{"Paris" => 18, "Tokyo" => 24, "NYC" => 15}
%{city: city, temp_c: Map.get(temps, city, 20)}
end
agent = SubAgent.new(
prompt: """
Fetch the temperature for Paris, Tokyo, and NYC in parallel using pmap.
Then return the warmest city.
""",
signature: "() -> :string",
tools: %{
"fetch_temp" => {fetch_tool,
signature: "(city :string) -> {city :string, temp_c :number}",
description: "Get current temperature for a city."}
},
max_turns: 2,
timeout: 15_000
)
{:ok, {_result, pmap_step}, pmap_trace_path} = TraceLog.with_trace(fn ->
{result, step} = SubAgent.run(agent, llm: my_llm, debug: true)
{result, step}
end)
IO.puts("Result: #{pmap_step.return}")
IO.puts("Trace: #{pmap_trace_path}")
pmap_trace_path
Inspect the Parallel Trace
Even though tools ran in separate sandbox processes, the events appear in a single trace file with proper span correlation:
events = Analyzer.load(pmap_trace_path)
IO.puts("=== Timeline (parallel tools visible) ===")
Analyzer.print_timeline(events)
IO.puts("\n=== Tool Events ===")
tool_events = Analyzer.filter(events, type: "tool")
for event <- tool_events do
tool = get_in(event, ["metadata", "tool_name"]) || "?"
span = event["span_id"] || "none"
parent = event["parent_span_id"] || "root"
duration = event["duration_ms"]
IO.puts(" #{event["event"]} #{tool} (span=#{String.slice(span, 0, 8)}... parent=#{String.slice(parent, 0, 8)}...) #{if duration, do: "#{duration}ms", else: ""}")
end
:ok
Manual Cross-Process Propagation
For custom code that spawns tasks, use TraceContext directly:
alias PtcRunner.TraceContext
# This is what PtcRunner does internally for pmap/pcalls:
#
# # In the parent process:
# ctx = TraceContext.capture()
#
# Task.async(fn ->
# TraceContext.attach(ctx)
# # Events from here are now captured in the parent's trace
# end)
#
# The sandbox and parallel primitives handle this automatically,
# so you only need manual propagation for custom Task spawning.
"See TraceContext.capture/0 and TraceContext.attach/1"
5. Configuration
Tune tracing behavior via application config or per-call options.
Tracer (In-Memory)
# Limit in-memory entries to prevent unbounded growth
alias PtcRunner.Tracer
tracer = Tracer.new(max_entries: 50)
IO.inspect(tracer.max_entries, label: "max_entries")
TraceLog (JSONL)
# These can be set in config/config.exs:
#
# config :ptc_runner,
# trace_max_string_size: 128_000, # Max string size before truncation (default: 65_536)
# trace_max_list_size: 200, # Max list length before summarizing (default: 100)
# trace_preserve_full_keys: ["system_prompt"], # Keys never truncated
# trace_dir: "traces" # Default output directory
# Current defaults:
IO.puts("trace_max_string_size: #{Application.get_env(:ptc_runner, :trace_max_string_size, 65_536)}")
IO.puts("trace_max_list_size: #{Application.get_env(:ptc_runner, :trace_max_list_size, 100)}")
IO.puts("trace_dir: #{Application.get_env(:ptc_runner, :trace_dir, "(current directory)")}")
:ok
Learn More
- Observability Guide — full reference for all three layers
- SubAgent Getting Started
- Composition Patterns
-
PtcRunner.TraceLog.Analyzer— API docs for offline analysis -
PtcRunner.TraceContext— cross-process propagation API