Oban Training—Introduction
The Pitch
Oban is a background job system built on modern PostgreSQL and SQLite3 with the primary goals of reliability, consistency and observability.
Why?
What are background jobs?
Computation that is performed asynchronously because it is slow, resource intensive, fragile, or must be resilient to errors.
Sure. But this is Elixir, why do we need something else to run background jobs?
- We have Task.async or even Task.Supervisor.async_nolink?!
- How do you control the number of jobs running at once?
- How do you recover from errors and retry?
- What if the node shuts down while your task is running?
- What if the job needs to run in a minute, or an hour?
- How can you see what jobs are running?
- To coordinate all those technicalities and requirements we need a system.
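To make the questions above concrete, here is a minimal sketch of what the standard library alone offers. The module and function names (TaskDemo, slow_double) are hypothetical stand-ins for real work. Task.async_stream caps concurrency, but nothing is persisted: there is no retry, no scheduling, and any pending work disappears if the node stops.

```elixir
defmodule TaskDemo do
  # Hypothetical stand-in for a slow piece of work.
  def slow_double(n) do
    Process.sleep(10)
    n * 2
  end

  # Runs the work with at most two tasks at a time.
  # Results only exist in memory; a crash or shutdown loses them.
  def run(inputs) do
    inputs
    |> Task.async_stream(&slow_double/1, max_concurrency: 2, timeout: 5_000)
    |> Enum.map(fn {:ok, result} -> result end)
  end
end
```

Calling TaskDemo.run([1, 2, 3]) returns the doubled values in input order, which covers the concurrency question but none of the others.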
Oban is a background job system
- There were (and are) other job processors for Elixir
- Some are in memory, some in Mnesia, others in Redis or RabbitMQ, and even some built on Postgres
- Those systems handle some of the requirements we just talked about (retrying for failures, scheduling in the future)
- They treat jobs as ephemeral data—once a job runs, it disappears forever
- Oban is different because it treats jobs as persistent data
Guiding principles
- Persistent — Retaining jobs between restarts
- Reliable — Never lose a job
- Observable — Expose system activity at all levels
Secondary goals
- Distributed execution and horizontal scaling
- Deep modules with simple interfaces
- Batteries included while staying extensible
Glossary
Node
A “node” is a BEAM host for one or more Oban instances. Nodes don’t need to be clustered, but they must have unique names and connect to the same PostgreSQL database.
Instance
An Oban supervision tree is called an “instance”. Applications can run multiple instances as long as each has a unique name (e.g. Oban.A, Oban.B, Oban.C).
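As a rough sketch of what two instances in one application might look like, assuming a hypothetical MyApp.Repo Ecto repo and illustrative queue names, each instance is started as its own child with its own name:

```elixir
# Children for an application's supervision tree (illustrative only).
# MyApp.Repo and the queue names are assumptions, not part of this training.
children = [
  MyApp.Repo,
  # Explicit ids keep the two child specs distinct within one supervisor.
  Supervisor.child_spec({Oban, name: Oban.A, repo: MyApp.Repo, queues: [default: 10]}, id: Oban.A),
  Supervisor.child_spec({Oban, name: Oban.B, repo: MyApp.Repo, queues: [media: 5]}, id: Oban.B)
]

Supervisor.start_link(children, strategy: :one_for_one)
```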
Queue
Oban segments jobs into named “queues”, each of which runs a configurable number of concurrent jobs. Every queue’s jobs live in the same database table.
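A queue configuration could look roughly like the following fragment. The application name (:my_app), the repo, and the queue names are assumptions chosen for illustration; each number is the concurrency limit for that queue on a single node.

```elixir
# config/config.exs (sketch)
import Config

config :my_app, Oban,
  repo: MyApp.Repo,
  # queue name => maximum number of jobs running at once on this node
  queues: [default: 10, mailers: 20, media: 5]
```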
Job
An Oban “job” wraps up the queue name, worker, arguments, state, and other options into a serializable struct (and Ecto schema) persisted as a row in the oban_jobs table.
Worker
A “worker” module performs one-off tasks called “jobs”. The worker’s perform/1 function receives a job with arguments and executes an application’s business logic.
defmodule MyApp.OnboardWorker do
  use Oban.Worker

  # Args are stored as JSON, so keys arrive as strings.
  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"user_id" => user_id}}) do
    user_id
    |> MyApp.fetch_user()
    |> MyApp.onboard_user()
  end
end
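To connect the worker to the earlier definitions, here is a sketch of enqueueing a job for it. It assumes an Oban instance is already running with the default name and that MyApp.OnboardWorker is defined as above; the user id is made up.

```elixir
# Build a job changeset for the worker and insert it into the oban_jobs table.
{:ok, _job} =
  %{user_id: 123}
  |> MyApp.OnboardWorker.new()
  |> Oban.insert()

# The same work, scheduled to run 60 seconds from now.
{:ok, _scheduled_job} =
  %{user_id: 123}
  |> MyApp.OnboardWorker.new(schedule_in: 60)
  |> Oban.insert()
```

Args may be given with atom keys when enqueueing, but they are serialized to JSON, which is why perform/1 matches on the string key "user_id".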
States
Jobs flow through “states” indicating their place in a finite state machine. They start as available or scheduled, transition to executing, then to an end state (completed, discarded, or cancelled), or back to retryable if there are retries available.
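The flow described above can be sketched as a small transition table. This module is purely illustrative and not part of Oban: the state names follow Oban's job states, but the table is a simplification and the module and function names are hypothetical.

```elixir
defmodule JobStates do
  @moduledoc "Illustrative sketch only; not part of Oban."

  # Simplified view of which states a job may move to next.
  @transitions %{
    "scheduled" => ["available", "cancelled"],
    "available" => ["executing", "cancelled"],
    "executing" => ["completed", "retryable", "discarded", "cancelled"],
    "retryable" => ["available", "cancelled"],
    "completed" => [],
    "discarded" => [],
    "cancelled" => []
  }

  # True when the table allows moving from one state to another.
  def valid_transition?(from, to), do: to in Map.get(@transitions, from, [])

  # End states have no outgoing transitions.
  def final?(state), do: Map.get(@transitions, state) == []
end
```

For example, JobStates.valid_transition?("executing", "retryable") holds, while nothing leaves "completed".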
That’s all the terminology for now. On to the exercises!