Benchmarking
Mix.install([
{:jason, "~> 1.4"},
{:kino, "~> 0.9", override: true},
{:youtube, github: "brooklinjazz/youtube"},
{:hidden_cell, github: "brooklinjazz/hidden_cell"},
{:benchee, "~> 1.1"}
])
Review Questions
Upon completing this lesson, a student should be able to answer the following questions.
- How can we benchmark our programs using :timer and Benchee?
- Why is it important to benchmark our software, rather than relying on theoretical performance with Big O Notation?
:timer
Out of the box, Erlang provides a :timer library which contains a tc/1 function to measure the time it takes to run a function. It returns a {time, result} tuple.
flowchart LR
Function --> timer[:timer.tc/1] --> return["{time, result}"]
We’ll use Process.sleep/1 to simulate a slow function. Process.sleep/1 pauses the program’s execution for a number of milliseconds.
:timer.tc(fn -> Process.sleep(1000) end)
The time returned is measured in microseconds. 1000 microseconds equal 1 millisecond, and 1000 milliseconds equal 1 second. Therefore, we can divide the number of microseconds by 1000 to get milliseconds, and the number of milliseconds by 1000 to get seconds.
You’ll notice that :timer is very close to accurate, but not perfectly accurate. This is to be expected with very small increments of time.
{microseconds, _result} = :timer.tc(fn -> Process.sleep(1000) end)
milliseconds = microseconds / 1000
seconds = milliseconds / 1000
IO.inspect(microseconds, label: "microseconds")
IO.inspect(milliseconds, label: "milliseconds")
IO.inspect(seconds, label: "seconds")
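As an aside, the standard library can also do this conversion for us. Below is a minimal sketch (not part of the original lesson) using System.convert_time_unit/3, which converts an integer amount of time between units and returns an integer result.
{microseconds, _result} = :timer.tc(fn -> Process.sleep(1000) end)

# Convert the measured microseconds into larger units.
# System.convert_time_unit/3 returns an integer, rounding down any fractional part.
milliseconds = System.convert_time_unit(microseconds, :microsecond, :millisecond)
seconds = System.convert_time_unit(microseconds, :microsecond, :second)

IO.inspect(milliseconds, label: "milliseconds (converted)")
IO.inspect(seconds, label: "seconds (converted)")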
Your Turn
In the Elixir cell below, use :timer.tc/1 to measure the following slow_function.
Example Solution
Using an anonymous function.
:timer.tc(fn -> slow_function.() end)
Passing the function directly.
:timer.tc(slow_function)
slow_function = fn -> Process.sleep(1000) end
Your Turn
In the Elixir cell below, use :timer.tc/1 to measure the following named function.
Example Solution
Using an anonymous function.
:timer.tc(fn -> MyModule.slow_function() end)
Using the capture operator.
:timer.tc(&MyModule.slow_function/0)
defmodule MyModule do
def slow_function do
Process.sleep(1000)
end
end
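:timer also provides tc/3, which accepts a module, a function name, and a list of arguments. Here is a small sketch measuring the MyModule.slow_function/0 defined above.
# :timer.tc/3 takes the module, the function name as an atom, and a list of arguments.
{microseconds, _result} = :timer.tc(MyModule, :slow_function, [])
IO.inspect(microseconds, label: "microseconds")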
Benchee
Benchee is a popular library for measuring performance and memory consumption.
Installing Benchee
External libraries need to be installed before we can use them.
We use Mix to install Benchee.
Mix is a build tool that provides tasks for creating, compiling, testing, and managing dependencies for Elixir projects.
flowchart LR
Benchee
D[Dependencies]
I[install/2]
Mix
Mix --> I
I --> D
D --> Benchee
We’ve installed Benchee for you in this lesson, but be aware that if you try to use Benchee in other projects, it likely isn’t installed.
If you want Benchee in any Livebook, you have to add it to the setup section at the top of the Livebook if it is not already available. Include Benchee in the existing Mix.install/2 call or add one. There should only ever be a single call to Mix.install/2.
Make sure to replace 1.1 with the latest version. You can find the latest version of Benchee on Hex.
Mix.install([
{:jason, "~> 1.4"},
{:kino, "~> 0.9", override: true},
{:benchee, "~> 1.1"}
])
In a Mix project, you’ll add the latest Benchee to the existing deps/0 function in mix.exs.
defp deps do
[
{:benchee, "~> 1.1"}
]
end
Usage
We can use the Benchee.run/2 function to measure the performance and memory consumption of some function.
Benchee.run(%{
"example test" => fn -> 10000 ** 10000 end
})
Above, the most important part of the output should look like the following, but with different numbers.
Name ips average deviation median 99th %
example test 154.65 6.47 ms ±26.50% 6.13 ms 13.17 ms
Memory usage statistics:
Name Memory usage
example test 600 B
The Benchee documentation explains how to read the output:
> * average - average execution time/memory usage (the lower the better)
> * ips - iterations per second, aka how often can the given function be executed within one second (the higher the better - good for graphing), only for run times
> * deviation - standard deviation (how much do the results vary), given as a percentage of the average (raw absolute values also available)
> * median - when all measured values are sorted, this is the middle value. More stable than the average and somewhat more likely to be a typical value you see, for the most typical value see mode. (the lower the better)
> * 99th % - 99th percentile, 99% of all measured values are less than this - worst case performance-ish
Using Benchee For Comparison
Let’s use Benchee to compare tuples and lists. Tuples are intended to be used as fixed-size containers that are fast to access. Lists are intended to be dynamically sized collections that get modified, so in theory lists should be slow to access and tuples should be fast to access. Let’s verify that assumption and prove it to be true.
size = 10000
large_list = Enum.to_list(0..size)
large_tuple = List.to_tuple(large_list)
Benchee.run(%{
"list" => fn -> Enum.at(large_list, size) end,
"tuple" => fn -> elem(large_tuple, size) end
})
Upon running the above, you should see an output similar to the following. You’ll notice that accessing a list was far slower!
Name ips average deviation median 99th %
tuple 1.09 M 0.92 μs ±2806.65% 0.80 μs 2 μs
list 0.0324 M 30.89 μs ±93.31% 28.90 μs 67.80 μs
Comparison:
tuple 1.09 M
list 0.0324 M - 33.58x slower +29.97 μs
Your Turn
Use Benchee to compare accessing the first element instead of the last element as we already did above. You should notice the list is faster this time, or at least close to the same speed. You’ll learn why in a future lesson.
Benchee.run(%{
"access first element in list" => fn -> nil end,
"access first element in tuple" => fn -> nil end
})
Multiple Inputs
We can use the :inputs option in Benchee.run/2 to benchmark multiple inputs.
Benchee.run(
%{
"Enum.count/1" => fn input -> Enum.count(input) end,
"Kernel.length/1" => fn input -> Kernel.length(input) end
},
inputs: [small: [1, 2, 3], medium: Enum.to_list(1..1000), large: Enum.to_list(1..100_000)]
)
You should see a benchmark for each input.
##### With Input Small #####
Name ips average deviation median 99th %
Kernel.length/1 1.34 M 745.94 ns ±3912.97% 500 ns 1600 ns
Enum.count/1 1.28 M 781.60 ns ±3327.02% 500 ns 2000 ns
Comparison:
Kernel.length/1 1.34 M
Enum.count/1 1.28 M - 1.05x slower +35.66 ns
##### With Input Medium #####
Name ips average deviation median 99th %
Kernel.length/1 274.56 K 3.64 μs ±627.53% 3.40 μs 6.30 μs
Enum.count/1 265.48 K 3.77 μs ±1277.99% 3.50 μs 8.20 μs
Comparison:
Kernel.length/1 274.56 K
Enum.count/1 265.48 K - 1.03x slower +0.124 μs
##### With Input Large #####
Name ips average deviation median 99th %
Kernel.length/1 5.30 K 188.70 μs ±63.77% 157.70 μs 897.38 μs
Enum.count/1 4.97 K 201.40 μs ±75.41% 159.50 μs 1096.38 μs
Comparison:
Kernel.length/1 5.30 K
Enum.count/1 4.97 K - 1.07x slower +12.70 μs
This is especially useful for benchmarking how larger data affects our system. It’s also useful for creating different situations (i.e. worst case and best case scenarios) that may cause our system to behave differently than expected.
How To Make A Good Benchmark
It’s easy for a benchmark to misrepresent the actual performance of our system.
To make our benchmarks more representative, consider the following metrics:
- variety: Use a variety of inputs in your benchmark.
- size: A system may be very fast with small amounts of data, but incredibly slow with large amounts of data. Consider using multiple different samples of data with different sizes to test where the system hits its limits.
- consistency: Consider running your benchmark multiple times. Does its performance change frequently due to edge cases?
In general, you can use benchmarks to optimize your systems and see how they perform under pressure. Consider exactly how you want to stress your system to get benchmark metrics that make the most sense for your situation.
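For example, here is a sketch of a benchmark that applies these ideas, assuming the installed Benchee version supports the warmup, time, memory_time, and inputs options (standard Benchee configuration, but check the documentation for your version). Longer measurement times and a range of input sizes make the results more consistent and representative.
Benchee.run(
  %{
    # The function under test receives each input in turn.
    "Enum.reverse/1" => fn list -> Enum.reverse(list) end
  },
  # Run each scenario for longer to smooth out noisy measurements.
  warmup: 2,
  time: 5,
  memory_time: 2,
  # Vary the input size to see where performance starts to degrade.
  inputs: [
    small: Enum.to_list(1..100),
    medium: Enum.to_list(1..10_000),
    large: Enum.to_list(1..1_000_000)
  ]
)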
We also want to consider what we’re measuring. For example, if we read the average time instead of the median, it’s easy for outliers to skew our data.
For example:
> Ben Hill Griffin Stadium, the stadium at the University of Florida, holds 88,548 people. In Florida the minimum wage is $11.00 per hour. Working an eight hour day, five days a week, fifty-two weeks a year, this comes out to $22,880 a year. If every seat in the stadium was filled by someone making minimum wage, the average annual income would be $22,880, and the total combined income would be just over two billion dollars ($2,025,978,240).
> In 2021, Jeff Bezos made roughly 197 billion dollars ($197,000,000,000). If Jeff Bezos enters the stadium, the total annual income becomes $199,025,978,240 and the AVERAGE income becomes about 2.24 million dollars ($2,247,636). One large outlier can really skew the average. This is often why we use the median, or middle value, of a set. It’s less skewed by large outliers.
>
> Example provided by Stephen Fowler
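To make the skew concrete, here is a small illustration (not part of the original example) that computes the average and the median of the stadium incomes above, using hypothetical average and median helpers defined as anonymous functions.
# 88,548 fans each earning Florida minimum wage for a year.
incomes = List.duplicate(22_880, 88_548)

average = fn values -> Enum.sum(values) / length(values) end

median = fn values ->
  values |> Enum.sort() |> Enum.at(div(length(values), 2))
end

IO.inspect(average.(incomes), label: "average without outlier")
IO.inspect(median.(incomes), label: "median without outlier")

# Add one large outlier: Jeff Bezos' 2021 income.
with_outlier = [197_000_000_000 | incomes]

IO.inspect(average.(with_outlier), label: "average with outlier")
IO.inspect(median.(with_outlier), label: "median with outlier")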
Commit Your Progress
DockYard Academy now recommends you use the latest Release rather than forking or cloning our repository.
Run git status to ensure there are no undesirable changes. Then run the following in your command line from the curriculum folder to commit your progress.
$ git add .
$ git commit -m "finish Benchmarking reading"
$ git push
We’re proud to offer our open-source curriculum free of charge for anyone to learn from at their own pace.
We also offer a paid course where you can learn from an instructor alongside a cohort of your peers. We will accept applications for the June-August 2023 cohort soon.