Scalar Optimization with Scholar.Optimize


Mix.install([
  {:scholar, path: "."},
  {:nx, "~> 0.9"},
  {:kino_vega_lite, "~> 0.1"}
])

Introduction

Scholar provides two algorithms for scalar (univariate) function minimization:

  • Scholar.Optimize.Brent - Brent’s method (recommended)
  • Scholar.Optimize.GoldenSection - Golden Section search

Both find the minimum of a function of one variable within a specified bracket [a, b].

Recommendation: Use Brent’s method for most applications. It combines the reliability of Golden Section search with the faster convergence of parabolic interpolation, typically requiring 3-5x fewer function evaluations on smooth functions.

Brent’s Method (Recommended)

Brent’s method is the default choice for scalar optimization. It intelligently combines:

  • Parabolic interpolation for fast convergence when the function is smooth
  • Golden section fallback for guaranteed convergence

Simple Parabola

Let’s minimize $f(x) = (x - 3)^2$, which has its minimum at $x = 3$.

alias Scholar.Optimize.Brent

# Define the objective function
fun = fn x -> Nx.pow(Nx.subtract(x, 3), 2) end

# Find the minimum using Brent's method
result = Brent.minimize(0.0, 5.0, fun)

IO.puts("Minimum found at x = #{Nx.to_number(result.x)}")
IO.puts("Function value: #{Nx.to_number(result.fun)}")
IO.puts("Converged: #{Nx.to_number(result.converged) == 1}")
IO.puts("Iterations: #{Nx.to_number(result.iterations)}")
IO.puts("Function evaluations: #{Nx.to_number(result.fun_evals)}")

Minimizing a Trigonometric Function

Find the minimum of $\sin(x)$ in the interval $[3, 5]$ (around $3\pi/2$):

alias Scholar.Optimize.Brent

fun = fn x -> Nx.sin(x) end

result = Brent.minimize(3.0, 5.0, fun)

expected_x = 3 * :math.pi() / 2  # 3π/2 ≈ 4.712

IO.puts("Minimum found at x = #{Nx.to_number(result.x)}")
IO.puts("Expected: #{expected_x}")
IO.puts("sin(x) at minimum: #{Nx.to_number(result.fun)}")

Shifted Parabola

Minimize $f(x) = (x + 2)^2 + 1$, which has minimum at $x = -2$ with value $1$:

alias Scholar.Optimize.Brent

fun = fn x -> Nx.add(Nx.pow(Nx.add(x, 2), 2), 1) end

result = Brent.minimize(-5.0, 5.0, fun)

IO.puts("Minimum found at x = #{Nx.to_number(result.x)}")
IO.puts("Function value: #{Nx.to_number(result.fun)}")
IO.puts("Expected: x = -2, f(x) = 1")

Brent vs Golden Section: Performance Comparison

Brent’s method typically converges much faster than Golden Section:

alias Scholar.Optimize.{Brent, GoldenSection}

fun = fn x -> Nx.pow(Nx.subtract(x, 3), 2) end

brent_result = Brent.minimize(0.0, 5.0, fun)
golden_result = GoldenSection.minimize(0.0, 5.0, fun)

IO.puts("Simple parabola (x-3)^2 on [0, 5]:")
IO.puts("  Brent:         #{Nx.to_number(brent_result.fun_evals)} function evaluations")
IO.puts("  Golden Section: #{Nx.to_number(golden_result.fun_evals)} function evaluations")
IO.puts("")
IO.puts("Both find x ≈ 3.0:")
IO.puts("  Brent x:  #{Nx.to_number(brent_result.x)}")
IO.puts("  Golden x: #{Nx.to_number(golden_result.x)}")

Wider Bracket Comparison

The advantage of Brent becomes more pronounced with wider brackets:

alias Scholar.Optimize.{Brent, GoldenSection}

fun = fn x -> Nx.pow(Nx.subtract(x, 50), 2) end

brent_result = Brent.minimize(0.0, 100.0, fun)
golden_result = GoldenSection.minimize(0.0, 100.0, fun)

IO.puts("Parabola (x-50)^2 on [0, 100]:")
IO.puts("  Brent:         #{Nx.to_number(brent_result.fun_evals)} function evaluations")
IO.puts("  Golden Section: #{Nx.to_number(golden_result.fun_evals)} function evaluations")
IO.puts("")
IO.puts("Speedup: #{Float.round(Nx.to_number(golden_result.fun_evals) / Nx.to_number(brent_result.fun_evals), 1)}x fewer evaluations")

How Brent’s Method Works

Brent’s method tracks six points during optimization:

  • a, b: Current bracket bounds
  • x: Best point found so far (lowest function value)
  • w: Second-best point
  • v: Previous value of w
  • u: Most recently evaluated trial point

At each iteration:

  1. Try parabolic interpolation through x, w, v
  2. If the parabolic step is acceptable (within bounds, making progress), use it
  3. Otherwise, take a golden section step into the larger segment
  4. Update the bracket and tracked points

This gives:

  • Superlinear convergence when the function is smooth (parabolic steps)
  • Guaranteed convergence when it’s not (golden section fallback)

alias Scholar.Optimize.Brent

fun = fn x -> Nx.pow(Nx.subtract(x, 1), 2) end

result = Brent.minimize(-5.0, 5.0, fun, tol: 1.0e-10, maxiter: 100)

IO.puts("Result from Brent.minimize:")
IO.puts("  x = #{Nx.to_number(result.x)}")
IO.puts("  f(x) = #{Nx.to_number(result.fun)}")
IO.puts("  Iterations: #{Nx.to_number(result.iterations)}")
IO.puts("  Function evaluations: #{Nx.to_number(result.fun_evals)}")

Golden Section Search

Golden Section search is included primarily for educational purposes. It uses the inverse golden ratio $\phi = \frac{\sqrt{5} - 1}{2} \approx 0.618$ to place its probe points inside the bracket.

Key properties:

  • Guaranteed convergence for unimodal functions
  • Bracket width decreases by factor $\phi \approx 0.618$ per iteration
  • Linear convergence rate
  • Requires only function evaluations (no derivatives)

alias Scholar.Optimize.GoldenSection

fun = fn x -> Nx.pow(Nx.subtract(x, 1), 2) end

result = GoldenSection.minimize(-5.0, 5.0, fun, tol: 1.0e-10, maxiter: 100)

IO.puts("Result from GoldenSection.minimize:")
IO.puts("  x = #{Nx.to_number(result.x)}")
IO.puts("  f(x) = #{Nx.to_number(result.fun)}")
IO.puts("  Iterations: #{Nx.to_number(result.iterations)}")
IO.puts("  Function evaluations: #{Nx.to_number(result.fun_evals)}")

GPU/JIT Compatibility

Both methods are implemented as pure defn functions, making them fully compatible with JIT compilation and GPU execution:

alias Scholar.Optimize.Brent

fun = fn x -> Nx.pow(Nx.subtract(x, 3), 2) end
opts = [tol: 1.0e-8, maxiter: 500]

# This works with jit_apply for GPU acceleration
result = Nx.Defn.jit_apply(&Brent.minimize/4, [0.0, 5.0, fun, opts])

IO.puts("JIT-compiled result:")
IO.puts("  x = #{Nx.to_number(result.x)}")
IO.puts("  Converged: #{Nx.to_number(result.converged) == 1}")

Options

Tolerance

Control convergence tolerance:

alias Scholar.Optimize.Brent

fun = fn x -> Nx.pow(Nx.subtract(x, 2), 2) end

# Default tolerance (1.0e-5)
result_default = Brent.minimize(0.0, 5.0, fun)

# Looser tolerance (faster)
result_loose = Brent.minimize(0.0, 5.0, fun, tol: 1.0e-3)

IO.puts("Default tolerance (1e-5): #{Nx.to_number(result_default.fun_evals)} function evals")
IO.puts("Loose tolerance (1e-3): #{Nx.to_number(result_loose.fun_evals)} function evals")

Maximum Iterations

Limit the number of iterations:

alias Scholar.Optimize.Brent

fun = fn x -> Nx.pow(Nx.subtract(x, 2), 2) end

result = Brent.minimize(0.0, 100.0, fun, maxiter: 5)

IO.puts("With maxiter: 5")
IO.puts("  Converged: #{Nx.to_number(result.converged) == 1}")
IO.puts("  Iterations used: #{Nx.to_number(result.iterations)}")

Higher Precision with f64

For higher precision, use f64 tensors for bounds:

alias Scholar.Optimize.Brent

fun = fn x -> Nx.pow(Nx.subtract(x, 3), 2) end

# f32 precision (default)
result_f32 = Brent.minimize(0.0, 5.0, fun)

# f64 precision
a = Nx.tensor(0.0, type: :f64)
b = Nx.tensor(5.0, type: :f64)
result_f64 = Brent.minimize(a, b, fun, tol: 1.0e-12)

IO.puts("f32 result: x = #{Nx.to_number(result_f32.x)}")
IO.puts("f64 result: x = #{Nx.to_number(result_f64.x)}")
IO.puts("f64 error:  #{abs(Nx.to_number(result_f64.x) - 3.0)}")

Result Struct

Both Brent.minimize/4 and GoldenSection.minimize/4 return structs with the same fields:

alias Scholar.Optimize.Brent

fun = fn x -> Nx.pow(Nx.subtract(x, 3), 2) end

result = Brent.minimize(0.0, 5.0, fun)

IO.puts("Result struct fields:")
IO.puts("  x:          #{Nx.to_number(result.x)} (solution)")
IO.puts("  fun:        #{Nx.to_number(result.fun)} (function value at solution)")
IO.puts("  converged:  #{Nx.to_number(result.converged)} (1 = yes, 0 = no)")
IO.puts("  iterations: #{Nx.to_number(result.iterations)}")
IO.puts("  fun_evals:  #{Nx.to_number(result.fun_evals)}")

When to Use Which Method

Method          Use Case
Brent           Default choice for all scalar optimization
Golden Section  Educational purposes, or when you need guaranteed linear convergence behavior

Both methods:

  • Work with any continuous unimodal function
  • Require no derivatives
  • Are JIT/GPU compatible
  • Guarantee convergence to a point within the specified bracket, given a sufficient iteration budget