Task
Mix.install([
{:kino, github: "livebook-dev/kino", override: true},
{:kino_lab, "~> 0.1.0-dev", github: "jonatanklosko/kino_lab"},
{:vega_lite, "~> 0.1.4"},
{:kino_vega_lite, "~> 0.1.1"},
{:benchee, "~> 0.1"},
{:ecto, "~> 3.7"},
{:math, "~> 0.7.0"},
{:faker, "~> 0.17.0"},
{:utils, path: "#{__DIR__}/../utils"}
])
Setup
Ensure you type the ea keyboard shortcut to evaluate all Elixir cells before starting. Alternatively, you can evaluate the Elixir cells as you read.
Task
Rather than use spawn/1 or spawn_link/1 when we want to create a process, we often rely on the Task module, which provides conveniences for spawning and awaiting processes.
Starting A Process
Similar to spawn, we can use Task.start/1 to create a new short-lived process that dies when its function finishes executing.
# using spawn
spawn_pid = spawn(fn -> IO.puts("spawn ran!") end)
Process.sleep(100)
Process.alive?(spawn_pid) || IO.puts("spawn is dead")
# using task
{:ok, task_pid} = Task.start(fn -> IO.puts("task ran!") end)
Process.sleep(100)
Process.alive?(task_pid) || IO.puts("task is dead")
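Sleeping for a fixed 100 milliseconds is only a rough way to wait for the process to die. As a minimal sketch, assuming we are free to make the task wait for a hypothetical :go message (not part of the original example), we can monitor the task with Process.monitor/1 and observe its exit directly:

```elixir
# The :go message is an illustrative assumption: it guarantees the monitor
# is in place before the task is allowed to finish.
{:ok, task_pid} =
  Task.start(fn ->
    receive do
      :go -> IO.puts("task ran!")
    end
  end)

ref = Process.monitor(task_pid)
send(task_pid, :go)

# Wait for the :DOWN message that the monitor delivers when the task exits.
exit_reason =
  receive do
    {:DOWN, ^ref, :process, ^task_pid, reason} -> reason
  after
    1000 -> :no_exit_seen
  end

IO.inspect(exit_reason, label: "task exit reason")
```

Because the function returns without crashing, the exit reason should be :normal, and the task is no longer alive once the :DOWN message has arrived.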
Awaiting Processes
We can run two computations concurrently by separating them into two different processes.
A computer with a multi-core processor can perform these computations in parallel, which may make our program faster. In reality, it’s a bit more complicated than this, but this is a reasonable mental model for now to understand why concurrency is useful for improving performance.
flowchart LR
Par[Parent Process]
CP1[Computation 1]
CP2[Computation 2]
P1[Process 1]
P2[Process 2]
Par --creates--> P1
Par --creates--> P2
P1 --performs --> CP1
P2 --performs --> CP2
CP1 --sends result to --> Par
CP2 --sends result to --> Par
To spawn/1 a process and retrieve a value from it (by sending the parent a message), we need a fair bit of boilerplate code.
parent = self()
spawn(fn ->
  # simulating expensive calculation
  Process.sleep(1000)
  send(parent, {:msg, "hello"})
end)
receive do
  {:msg, value} -> value
end
With Task, we can use async/1 and await/1 to spawn a process, perform some calculation, and then retrieve the value when it’s finished.
task =
  Task.async(fn ->
    # simulating expensive calculation
    Process.sleep(1000)
    {:msg, "hello"}
  end)
Task.await(task)
Keep in mind that Task.async/1 returns a Task struct, not a pid. The Task struct contains the pid of the parent process (:owner), the task’s own pid (:pid), and a reference (:ref) used to monitor whether the task crashes.
Task.async(fn -> nil end)
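As a small sketch of the fields described above, we can inspect the struct directly. The :done return value is an arbitrary placeholder chosen for illustration.

```elixir
task = Task.async(fn -> :done end)

# :owner is the process that called Task.async/1, which is the current process.
IO.inspect(task.owner == self(), label: "owner is the caller")
# :pid identifies the newly spawned task process.
IO.inspect(is_pid(task.pid), label: "pid is a pid")
# :ref is the monitor reference for the task.
IO.inspect(is_reference(task.ref), label: "ref is a reference")

# Await the task so its reply does not linger in the mailbox.
result = Task.await(task)
```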
To demonstrate the performance value of concurrency, let’s say we have two computations that each take 1 second. Running them synchronously, one after the other, would normally take 2 seconds.
computation1 = fn -> Process.sleep(1000) end
computation2 = fn -> Process.sleep(1000) end
{microseconds, _result} =
  :timer.tc(fn ->
    computation1.()
    computation2.()
  end)
# expected to be ~2 seconds
microseconds / 1000 / 1000
By running these computations in parallel, we can theoretically reduce this time to 1 second instead of 2.
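> Note: with real CPU-bound computations, a computer without multiple cores would still take about 2 seconds rather than the expected 1 second. Process.sleep/1 only waits and does not occupy a core, so this particular example finishes in about 1 second either way.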
computation1 = fn -> Process.sleep(1000) end
computation2 = fn -> Process.sleep(1000) end
{microseconds, _result} =
  :timer.tc(fn ->
    task1 = Task.async(fn -> computation1.() end)
    task2 = Task.async(fn -> computation2.() end)
    Task.await(task1)
    Task.await(task2)
  end)
# expected to be ~1 second
microseconds / 1000 / 1000
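How much real parallelism is available depends on the machine. As a rough check, System.schedulers_online/0 reports how many schedulers the Erlang VM is currently running, which by default is typically one per logical core:

```elixir
# Number of schedulers available to run processes in parallel.
schedulers = System.schedulers_online()
IO.puts("schedulers online: #{schedulers}")
```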
Your Turn
Use Task.async/1 and Task.await/1 to demonstrate the performance difference between synchronous execution and parallel execution.
You may consider using Process.sleep/1 to simulate an expensive computation.
Awaiting Many Tasks
When working with many parallel tasks, we can use enumeration to spawn many tasks.
tasks =
  Enum.map(1..5, fn each ->
    Task.async(fn ->
      Process.sleep(1000)
      each * 2
    end)
  end)
Then we can also use enumeration to await/1 each task.
Enum.map(tasks, fn task -> Task.await(task) end)
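To see that the tasks really do overlap, the spawning and awaiting steps can be timed together. This is a sketch only: the 100 millisecond sleep is an arbitrary value chosen to keep the example quick, not a value from the lesson.

```elixir
{microseconds, results} =
  :timer.tc(fn ->
    1..5
    |> Enum.map(fn each ->
      Task.async(fn ->
        # shortened stand-in for an expensive calculation
        Process.sleep(100)
        each * 2
      end)
    end)
    |> Enum.map(&Task.await/1)
  end)

IO.inspect(results, label: "results")
# Five sequential sleeps would take about 500 ms; the concurrent version
# should finish in roughly the time of a single sleep.
IO.inspect(microseconds / 1000, label: "elapsed ms")
```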
You may also choose to use the await_many/1 convenience function provided by the Task module.
tasks =
  Enum.map(1..5, fn each ->
    Task.async(fn ->
      Process.sleep(1000)
      each * 2
    end)
  end)
Task.await_many(tasks)
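Task.await_many/1 returns the results in the same order as the list of tasks it was given, regardless of which task finishes first. A minimal sketch, using made-up sleep durations so that the tasks finish out of order:

```elixir
tasks =
  Enum.map([300, 100, 200], fn ms ->
    Task.async(fn ->
      # each task sleeps for a different time, then returns that time
      Process.sleep(ms)
      ms
    end)
  end)

# The 100 ms task finishes first, but the results follow the order of `tasks`.
results = Task.await_many(tasks)
IO.inspect(results)
```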
Timeouts
Task.await/1 pauses the current execution to wait until a task has finished. However, it will not wait forever. By default, Task.await/1 and Task.await_many/1 wait five seconds (5000 milliseconds) for the task to complete. If the task does not finish in time, the calling process exits with a timeout error.
task = Task.async(fn -> Process.sleep(6000) end)
Task.await(task)
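The timeout surfaces as an exit in the calling process, so it can be caught with try/catch when crashing is not desirable. The sketch below uses shortened, arbitrary durations (a 500 millisecond task and a 100 millisecond timeout) and then stops the still-running task with Task.shutdown/2; whether catching the exit is appropriate depends on the application.

```elixir
task = Task.async(fn -> Process.sleep(500) end)

outcome =
  try do
    # 100 ms is shorter than the task's 500 ms sleep, so this times out
    Task.await(task, 100)
  catch
    :exit, {:timeout, _details} -> :timed_out
  end

# The task is still running after the timeout, so stop it explicitly.
Task.shutdown(task, :brutal_kill)

IO.inspect(outcome)
```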
If we want to wait for more or less time, we can override the default value. await/2 and await_many/2 accept a timeout value in milliseconds as their second argument.
task = Task.async(fn -> Process.sleep(6000) end)
Task.await(task, 7000)
task1 = Task.async(fn -> Process.sleep(6000) end)
task2 = Task.async(fn -> Process.sleep(6000) end)
Task.await_many([task1, task2], 7000)
Your Turn
In the Elixir cell below, spawn a task which takes one second to complete. await/2 the task and alter the timeout value to be less than one second. Awaiting the task should crash.
Further Reading
For more on Task, consider reading:
- The Task HexDocs
Commit Your Progress
Run the following in your command line from the project folder to track and save your progress in a Git commit.
$ git add .
$ git commit -m "finish task section"