Scalar Optimization with Scholar.Optimize
Mix.install([
{:scholar, path: "."},
{:nx, "~> 0.9"},
{:kino_vega_lite, "~> 0.1"}
])
Introduction
Scholar provides two algorithms for scalar (univariate) function minimization:
- `Scholar.Optimize.Brent` - Brent's method (recommended)
- `Scholar.Optimize.GoldenSection` - Golden Section search
Both find the minimum of a function of one variable within a specified bracket [a, b].
Recommendation: Use Brent’s method for most applications. It combines the reliability of Golden Section with faster convergence from parabolic interpolation, typically requiring 3-5x fewer function evaluations.
Brent’s Method (Recommended)
Brent’s method is the default choice for scalar optimization. It intelligently combines:
- Parabolic interpolation for fast convergence when the function is smooth
- Golden section fallback for guaranteed convergence
Simple Parabola
Let’s minimize $f(x) = (x - 3)^2$, which has its minimum at $x = 3$.
alias Scholar.Optimize.Brent
# Define the objective function
fun = fn x -> Nx.pow(Nx.subtract(x, 3), 2) end
# Find the minimum using Brent's method
result = Brent.minimize(0.0, 5.0, fun)
IO.puts("Minimum found at x = #{Nx.to_number(result.x)}")
IO.puts("Function value: #{Nx.to_number(result.fun)}")
IO.puts("Converged: #{Nx.to_number(result.converged) == 1}")
IO.puts("Iterations: #{Nx.to_number(result.iterations)}")
IO.puts("Function evaluations: #{Nx.to_number(result.fun_evals)}")
Minimizing a Trigonometric Function
Find the minimum of $\sin(x)$ in the interval $[3, 5]$ (around $3\pi/2$):
alias Scholar.Optimize.Brent
fun = fn x -> Nx.sin(x) end
result = Brent.minimize(3.0, 5.0, fun)
expected_x = 3 * :math.pi() / 2 # 3π/2 ≈ 4.712
IO.puts("Minimum found at x = #{Nx.to_number(result.x)}")
IO.puts("Expected: #{expected_x}")
IO.puts("sin(x) at minimum: #{Nx.to_number(result.fun)}")
Shifted Parabola
Minimize $f(x) = (x + 2)^2 + 1$, which has its minimum at $x = -2$ with value $1$:
alias Scholar.Optimize.Brent
fun = fn x -> Nx.add(Nx.pow(Nx.add(x, 2), 2), 1) end
result = Brent.minimize(-5.0, 5.0, fun)
IO.puts("Minimum found at x = #{Nx.to_number(result.x)}")
IO.puts("Function value: #{Nx.to_number(result.fun)}")
IO.puts("Expected: x = -2, f(x) = 1")
Brent vs Golden Section: Performance Comparison
Brent’s method typically converges much faster than Golden Section:
alias Scholar.Optimize.{Brent, GoldenSection}
fun = fn x -> Nx.pow(Nx.subtract(x, 3), 2) end
brent_result = Brent.minimize(0.0, 5.0, fun)
golden_result = GoldenSection.minimize(0.0, 5.0, fun)
IO.puts("Simple parabola (x-3)^2 on [0, 5]:")
IO.puts(" Brent: #{Nx.to_number(brent_result.fun_evals)} function evaluations")
IO.puts(" Golden Section: #{Nx.to_number(golden_result.fun_evals)} function evaluations")
IO.puts("")
IO.puts("Both find x ≈ 3.0:")
IO.puts(" Brent x: #{Nx.to_number(brent_result.x)}")
IO.puts(" Golden x: #{Nx.to_number(golden_result.x)}")
Wider Bracket Comparison
The advantage of Brent becomes more pronounced with wider brackets:
alias Scholar.Optimize.{Brent, GoldenSection}
fun = fn x -> Nx.pow(Nx.subtract(x, 50), 2) end
brent_result = Brent.minimize(0.0, 100.0, fun)
golden_result = GoldenSection.minimize(0.0, 100.0, fun)
IO.puts("Parabola (x-50)^2 on [0, 100]:")
IO.puts(" Brent: #{Nx.to_number(brent_result.fun_evals)} function evaluations")
IO.puts(" Golden Section: #{Nx.to_number(golden_result.fun_evals)} function evaluations")
IO.puts("")
IO.puts("Speedup: #{Float.round(Nx.to_number(golden_result.fun_evals) / Nx.to_number(brent_result.fun_evals), 1)}x fewer evaluations")
How Brent’s Method Works
Brent’s method tracks six points during optimization:
- `a`, `b`: Current bracket bounds
- `x`: Best point found so far (lowest function value)
- `w`: Second-best point
- `v`: Previous value of `w`
- `u`: Most recently evaluated trial point
At each iteration:
- Try parabolic interpolation through x, w, v
- If the parabolic step is acceptable (within bounds, making progress), use it
- Otherwise, take a golden section step into the larger segment
- Update the bracket and tracked points
This gives:
- Superlinear convergence when the function is smooth (parabolic steps)
- Guaranteed convergence when it’s not (golden section fallback)
alias Scholar.Optimize.Brent
fun = fn x -> Nx.pow(Nx.subtract(x, 1), 2) end
result = Brent.minimize(-5.0, 5.0, fun, tol: 1.0e-10, maxiter: 100)
IO.puts("Result from Brent.minimize:")
IO.puts(" x = #{Nx.to_number(result.x)}")
IO.puts(" f(x) = #{Nx.to_number(result.fun)}")
IO.puts(" Iterations: #{Nx.to_number(result.iterations)}")
IO.puts(" Function evaluations: #{Nx.to_number(result.fun_evals)}")
Golden Section Search
Golden Section search is included primarily for educational purposes. It places probe points using the golden ratio conjugate $\phi = \frac{\sqrt{5} - 1}{2} \approx 0.618$ (the reciprocal of the golden ratio).
Key properties:
- Guaranteed convergence for unimodal functions
- Bracket width decreases by factor $\phi \approx 0.618$ per iteration
- Linear convergence rate
- Requires only function evaluations (no derivatives)
alias Scholar.Optimize.GoldenSection
fun = fn x -> Nx.pow(Nx.subtract(x, 1), 2) end
result = GoldenSection.minimize(-5.0, 5.0, fun, tol: 1.0e-10, maxiter: 100)
IO.puts("Result from GoldenSection.minimize:")
IO.puts(" x = #{Nx.to_number(result.x)}")
IO.puts(" f(x) = #{Nx.to_number(result.fun)}")
IO.puts(" Iterations: #{Nx.to_number(result.iterations)}")
IO.puts(" Function evaluations: #{Nx.to_number(result.fun_evals)}")
GPU/JIT Compatibility
Both methods are implemented as pure defn functions, making them fully compatible with JIT compilation and GPU execution:
alias Scholar.Optimize.Brent
fun = fn x -> Nx.pow(Nx.subtract(x, 3), 2) end
opts = [tol: 1.0e-8, maxiter: 500]
# This works with jit_apply for GPU acceleration
result = Nx.Defn.jit_apply(&Brent.minimize/4, [0.0, 5.0, fun, opts])
IO.puts("JIT-compiled result:")
IO.puts(" x = #{Nx.to_number(result.x)}")
IO.puts(" Converged: #{Nx.to_number(result.converged) == 1}")
Options
Tolerance
Control the convergence tolerance with the `tol` option:
alias Scholar.Optimize.Brent
fun = fn x -> Nx.pow(Nx.subtract(x, 2), 2) end
# Default tolerance (1.0e-5)
result_default = Brent.minimize(0.0, 5.0, fun)
# Looser tolerance (faster)
result_loose = Brent.minimize(0.0, 5.0, fun, tol: 1.0e-3)
IO.puts("Default tolerance (1e-5): #{Nx.to_number(result_default.fun_evals)} function evals")
IO.puts("Loose tolerance (1e-3): #{Nx.to_number(result_loose.fun_evals)} function evals")
Maximum Iterations
Limit the number of iterations:
alias Scholar.Optimize.Brent
fun = fn x -> Nx.pow(Nx.subtract(x, 2), 2) end
result = Brent.minimize(0.0, 100.0, fun, maxiter: 5)
IO.puts("With maxiter: 5")
IO.puts(" Converged: #{Nx.to_number(result.converged) == 1}")
IO.puts(" Iterations used: #{Nx.to_number(result.iterations)}")
Higher Precision with f64
For higher precision, use f64 tensors for bounds:
alias Scholar.Optimize.Brent
fun = fn x -> Nx.pow(Nx.subtract(x, 3), 2) end
# f32 precision (default)
result_f32 = Brent.minimize(0.0, 5.0, fun)
# f64 precision
a = Nx.tensor(0.0, type: :f64)
b = Nx.tensor(5.0, type: :f64)
result_f64 = Brent.minimize(a, b, fun, tol: 1.0e-12)
IO.puts("f32 result: x = #{Nx.to_number(result_f32.x)}")
IO.puts("f64 result: x = #{Nx.to_number(result_f64.x)}")
IO.puts("f64 error: #{abs(Nx.to_number(result_f64.x) - 3.0)}")
Result Struct
Both Brent.minimize/4 and GoldenSection.minimize/4 return structs with the same fields:
alias Scholar.Optimize.Brent
fun = fn x -> Nx.pow(Nx.subtract(x, 3), 2) end
result = Brent.minimize(0.0, 5.0, fun)
IO.puts("Result struct fields:")
IO.puts(" x: #{Nx.to_number(result.x)} (solution)")
IO.puts(" fun: #{Nx.to_number(result.fun)} (function value at solution)")
IO.puts(" converged: #{Nx.to_number(result.converged)} (1 = yes, 0 = no)")
IO.puts(" iterations: #{Nx.to_number(result.iterations)}")
IO.puts(" fun_evals: #{Nx.to_number(result.fun_evals)}")
When to Use Which Method
| Method | Use Case |
|---|---|
| Brent | Default choice for all scalar optimization |
| Golden Section | Educational purposes, or when you need guaranteed linear convergence behavior |
Both methods:
- Work with any continuous unimodal function
- Require no derivatives
- Are JIT/GPU compatible
- Guarantee convergence within the specified bracket
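One practical note: both methods only minimize. To maximize a function, minimize its negation — a minimal sketch using $f(x) = x e^{-x}$, whose maximum is at $x = 1$:

```elixir
alias Scholar.Optimize.Brent

# Maximize x * exp(-x) on [0, 5] by minimizing its negation
neg_fun = fn x -> Nx.negate(Nx.multiply(x, Nx.exp(Nx.negate(x)))) end

result = Brent.minimize(0.0, 5.0, neg_fun)

IO.puts("Maximum found at x = #{Nx.to_number(result.x)}")
IO.puts("Maximum value: #{-Nx.to_number(result.fun)}")
```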