Maps, Mapsets, and Keyword Lists
Mix.install([
{:jason, "~> 1.4"},
{:kino, "~> 0.8.0", override: true},
{:youtube, github: "brooklinjazz/youtube"},
{:hidden_cell, github: "brooklinjazz/hidden_cell"}
])
Navigation
Setup
Ensure you type the ea keyboard shortcut to evaluate all Elixir cells before starting. Alternatively you can evaluate the Elixir cells as you read.
Review Questions
Upon completing this lesson, a student should be able to answer the following questions.
- When should you use a MapSet vs a list?
- What are the performance impacts of MapSets, Maps, and Keyword Lists?
Overview
In this lesson, we will study the internal details of Maps, MapSets, and Keyword Lists to determine when to use each data type.
Keyword Lists
Keyword lists have the same properties as Lists.
| Operation | Time Complexity |
|---|---|
| Access | $O(n)$ |
| Search | $O(n)$ |
| Insertion | $O(1)$ for prepending, otherwise $O(n)$ |
| Deletion | $O(1)$ if first element, otherwise $O(n)$ |
Because they are simply lists of {:atom, value} Tuples.
[{:key1, "value"}, {:key2, "value"}] == [key1: "value", key2: "value"]
Maps
Maps are a key-value data type that typically has $O(log(n))$ time for any operation involving accessing a key.
| Operation | Time Complexity |
|---|---|
| Access | $O(log(n))$ |
| Search | $O(log(n))$ |
| Insertion | $O(n)$ for <= 32 elements, otherwise $O(log(n))$ |
| Deletion | $O(n)$ for <= 32 elements, otherwise $O(log(n))$ |
Maps with 32 or fewer keys are sorted lists. Therefore they have $O(n)$ performance, but this effect is negligible.
Notice that the map below retains its sort order.
map = Map.new(1..32, fn x -> {x, "#{x}"} end)
Under the hood, a map with 32 or fewer keys uses a sorted list.
Generally, we should avoid relying on this property. When a map grows beyond 32 keys, it loses its sorting order.
Notice the map below is no longer ordered
now that it has 33 keys.
map = Map.new(1..33, fn x -> {x, "#{x}"} end)
Maps in Elixir are Hash Array Mapped Trie (HAMT) structures.
We won’t dive deeply into the implementation of this data structure as it is beyond the scope of this course.
For practical purposes, be aware most operations on a map will have $O(log(n))$ complexity.
Maps scale better for large values when compared with lists or keyword lists, which grow linearly instead of logarithmically.
Your Turn
Accessing any key in a map with a million elements is nearly instant, regardless of which key is accessed!
First, we’ll create the large map. This may take a moment.
large_map = Map.new(1..10_000_000, fn x -> {x, x} end)
While creating the map took some time, reading a value should be extremely fast.
Evaluate the cell below and notice how quickly we can access a value. Try changing key to any integer between 0 and 999_999. Notice there is no significant affect on the speed of accessing a value.
key = 500_000
{micro_seconds, _result} = :timer.tc(fn -> Map.get(large_map, key) end)
micro_seconds
On the other hand, Lists are slow compared to Maps because they need to enumerate each element.
large_list = Enum.to_list(1..1_000_000)
Try changing index to any integer between 0 and 999_999 and notice that the larger the index value, the slower it takes to access an element.
index = 500_000
{micro_seconds, _result} = :timer.tc(fn -> Enum.at(large_list, index) end)
micro_seconds
MapSets
MapSets are a set data structure. They function like a list that only allows non-duplicate values.
MapSet.new([1, 2, 3])
MapSets ignore duplicate values.
MapSet.new([1, 2, 3, 3])
MapSets are Maps under the hood, so they inherit the same performance properties as Maps.
| Operation | Time Complexity |
|---|---|
| Access | $O(log(n))$ |
| Search | $O(log(n))$ |
| Insertion | $O(n)$ for <= 32 elements, otherwise $O(log(n))$ |
| Deletion | $O(n)$ for <= 32 elements, otherwise $O(log(n))$ |
The MapSet module contains functions for working with MapSets.
Here’s a few to get you started:
-
new/1create a new mapset. -
put/2add an element into a mapset. -
delete/2remove an element from a mapset.
A MapSet can contain any value. Keep in mind that order is not guaranteed.
MapSet.new(["1", 2, :three])
put/2 adds any non-duplicate element to a mapset.
set = MapSet.new([1, 2, 3])
MapSet.put(set, 4)
Duplicates will simply be ignored.
set = MapSet.new([1, 2, 3])
MapSet.put(set, 2)
delete/2 removes an element from a MapSet.
set = MapSet.new([1, 2, 3])
MapSet.delete(set, 2)
MapSet is enumerable. However, any Enum function with a MapSet returns a list.
MapSet.new([1, 2, 3]) |> Enum.map(fn each -> each + 1 end)
Generally, Lists are overused and can often be a MapSet instead. If you have a list where there should never be duplicate elements and order is not important, it’s usually more performant to use a MapSet.
Your Turn
Create a MapSet with the elements 1, 2, 3. Use MapSet.member?/2 to check if your MapSet contains
the element 2. Your answer should return true.
Mark As Completed
file_name = Path.basename(Regex.replace(~r/#.+/, __ENV__.file, ""), ".livemd")
save_name =
case Path.basename(__DIR__) do
"reading" -> "maps_mapsets_keyword_lists_reading"
"exercises" -> "maps_mapsets_keyword_lists_exercise"
end
progress_path = __DIR__ <> "/../progress.json"
existing_progress = File.read!(progress_path) |> Jason.decode!()
default = Map.get(existing_progress, save_name, false)
form =
Kino.Control.form(
[
completed: input = Kino.Input.checkbox("Mark As Completed", default: default)
],
report_changes: true
)
Task.async(fn ->
for %{data: %{completed: completed}} <- Kino.Control.stream(form) do
File.write!(
progress_path,
Jason.encode!(Map.put(existing_progress, save_name, completed), pretty: true)
)
end
end)
form
Commit Your Progress
Run the following in your command line from the curriculum folder to track and save your progress in a Git commit.
Ensure that you do not already have undesired or unrelated changes by running git status or by checking the source control tab in Visual Studio Code.
$ git checkout -b maps-mapsets-keyword-lists-reading
$ git add .
$ git commit -m "finish maps mapsets keyword lists reading"
$ git push origin maps-mapsets-keyword-lists-reading
Create a pull request from your maps-mapsets-keyword-lists-reading branch to your solutions branch.
Please do not create a pull request to the DockYard Academy repository as this will spam our PR tracker.
DockYard Academy Students Only:
Notify your teacher by including @BrooklinJazz in your PR description to get feedback.
You (or your teacher) may merge your PR into your solutions branch after review.
If you are interested in joining the next academy cohort, sign up here to receive more news when it is available.
Up Next
| Previous | Next |
|---|---|
| Lists and Tuples | Drills: MapSets |