Introduction to Supervisors
Navigation
Home The registry moduleSupervision strategiesWhat is a Supervisor?
In our previous lesson on OTP, we explored the Genserver behavior. However, there’s another critical behavior in OTP that deserves our attention: the supervisor.
Supervisors fulfill the role of overseeing other processes, often referred to as child processes, and contribute to the creation of a hierarchical process structure called a supervision tree. This tree not only ensures fault-tolerance but also governs the application’s startup and shutdown processes.
Supervisors are the driving force behind the Elixir developer’s inclination towards embracing the “let it crash” or “fail fast” philosophy. This approach allows supervisors to automatically restart crashed processes, facilitating a more robust system.
To better understand how supervisors work, let’s examine a simple example. We’ll create a Stack Genserver module with a bug that causes it to crash when attempting to pop an element from an empty stack. Since our Genserver is supervised, we can observe how the supervisor automatically restarts the failed Genserver process when it crashes.
defmodule Stack do
use GenServer
def start_link(%{initial_value: value, name: name}) do
GenServer.start_link(__MODULE__, value, name: name)
end
## Callbacks
@impl true
def init(arg) do
IO.puts("Stack GenServer starting up!")
{:ok, [arg]}
end
@impl true
def handle_call({:push, element}, _from, stack) do
IO.puts("Pushed #{inspect(element)}")
{:reply, :pushed, [element | stack]}
end
@impl true
def handle_cast(:pop, [popped | stack]) do
IO.puts("Popped #{inspect(popped)}")
{:noreply, stack}
end
end
children = [
%{
id: :stack_1,
# The Stack is a child porcess started via Stack.start_link/1
start: {Stack, :start_link, [%{initial_value: 0, name: :stack_1}]}
}
]
# Now we start the supervisor process and pass it the list of child specs (child processes to supervise)
# On starting the supervisor, it automatically starts all the child processes and supervises them
{:ok, supervisor_pid} = Supervisor.start_link(children, strategy: :one_for_one)
# After the supervisor starts, we can query the supervisor for information regarding all child processes supervised under it
Supervisor.which_children(supervisor_pid) |> IO.inspect(label: "Supervisor's children")
Now lets see what happens if our Stack Genserver process crashes
GenServer.whereis(:stack_1) |> IO.inspect(label: "Stack Genserver Process pid")
:sys.get_state(GenServer.whereis(:stack_1))
|> IO.inspect(label: "Intial Genserver state")
GenServer.call(:stack_1, {:push, 10})
GenServer.call(:stack_1, {:push, 20})
GenServer.cast(:stack_1, :pop)
GenServer.cast(:stack_1, :pop)
GenServer.cast(:stack_1, :pop)
:sys.get_state(GenServer.whereis(:stack_1))
|> IO.inspect(label: "Genserver state just before crash")
# Boom! Stack genserver crashes..
GenServer.cast(:stack_1, :pop)
# wait for the supervisor to restart the Stack Server process
Process.sleep(200)
GenServer.whereis(:stack_1) |> IO.inspect(label: "Restarted stack Genserver Process pid")
:sys.get_state(GenServer.whereis(:stack_1)) |> IO.inspect(label: "Genserver state after crash")
Child Specs
When starting a supervisor, we have the option to provide a list of child specifications that dictate how the supervisor should handle starting, stopping, and restarting each child process.
A supervisor can supervise two types of processes: workers and other supervisor processes. The former is commonly known as a worker
, while the latter is referred to as a supervisor
, typically forming a supervision tree.
A child specification is represented as a map with up to six elements. The first two elements are mandatory, while the remaining ones are optional.
Lets go through the different options that we can specify in the supervisor child spec
-
:id
- This key is required and serves as an internal identifier used by the supervisor to identify the child specification. It should be unique among the workers within the same supervisor. -
:start
- This key is required and contains a tuple specifying the module, function, and arguments used to start the child process. -
:restart
- This optional key, defaulted to:permanent
, is an atom that determines when a terminated child process should be restarted. -
:shutdown
- This optional key, defaulted to 5_000 (5 seconds) for workers and:infinity
for supervisors, specifies how a child process should be terminated, either by an integer representing a timeout or the atom:infinity
. -
:type
- This optional key, defaulted to:worker
, specifies whether the child process is a:worker
or a:supervisor
. -
:modules
- This optional key contains a list of modules used by hot code upgrade mechanisms to identify processes using specific modules.
A child specification can be defined in one of three ways:
-
As a map representing the child specification itself.
children = [ %{ id: :stack_1, start: {Stack, :start_link, [%{initial_value: 0, name: :stack_1}]} } ]
The above example defines a child with
:id
of:stack_1
, which is started by invokingStack.start_link(%{initial_value: 0, name: :stack_1})
. -
As a tuple with the module name as the first element and the start argument as the second.
children = [ {Stack, %{initial_value: 0, name: :stack_1}} ]
When using this shorthand notation, the supervisor calls
Stack.child_spec(%{initial_value: 0, name: :stack_1})
to retrieve the child specification. TheStack
module is responsible for defining its ownchild_spec/1
function.The
Stack
module can define its child specification as follows:def child_spec(arg) do %{ id: Stack, start: {Stack, :start_link, [arg]} } end
In this case, since
GenServer
already definesStack.child_spec/1
, we can leverage the automatically generatedchild_spec/1
function and customize it by passing options directly touse GenServer
. We will see examples of this in later chapters -
Alternatively, a child specification can be specified by providing only the module name.
children = [Stack]
This is equivalent to
{Stack, []}
. However, in our case, it would be invalid sinceStack.start_link/1
requires an initial value, and passing an empty list wouldn’t work.
The Supervisor.child_spec/2
function
When using the shorthand notations mentioned above, such as the {module, arg}
tuple or a module name only as a child specification, we can modify the generated child specifications using the Supervisor.child_spec/2
function.
-
When a two-element tuple of the form
{module, arg}
is provided, the child specification is retrieved by callingmodule.child_spec(arg)
. -
When only a module is given, the child specification is retrieved by calling
module.child_spec([])
.
After retrieving the child specification, any overrides specified in the function argument are applied directly to the child spec.
For example, we can use the shorthand notation {Stack, %{initial_value: 0, name: :stack_1}}
, but this would set id: Stack
as the child’s identifier since it is the default behavior of module.child_spec(arg)
. However, we can override this behavior as shown below:
children = [
Supervisor.child_spec({Stack, %{initial_value: 0, name: :stack_1}}, id: :special_stack)
]
Resources
- https://hexdocs.pm/elixir/1.14.4/Supervisor.html
- https://elixir-lang.org/getting-started/mix-otp/supervisor-and-application.html