Divide And Conquer
Mix.install([
{:jason, "~> 1.4"},
{:kino, "~> 0.9", override: true},
{:youtube, github: "brooklinjazz/youtube"},
{:hidden_cell, github: "brooklinjazz/hidden_cell"},
{:benchee, "~> 1.1"}
])
Navigation
Setup
Ensure you type the ea
keyboard shortcut to evaluate all Elixir cells before starting. Alternatively, you can evaluate the Elixir cells as you read.
Merge Sort
Recursion is often associated with divide and conquer algorithms.
Divide and conquer algorithms recursively split the problem into many similar sub-problems.
Let’s take a popular divide and conquer sorting algorithm called merge sort to help understand.
Merge sort allows us to sort a list into ascending order, so [4, 3, 2]
becomes [2, 3, 4]
MergeSort.sort([4, 3, 2])
[2, 3, 4]
Now, how might you implement a sorting function from scratch?
Well, let’s start simple, and work our way up.
Sort An Empty List
How could you sort an empty list? By doing nothing! An empty list is already sorted, the same goes for a list with one element.
[]
[1]
Sort A Two Element List
Now, let’s step up the challenge. How would you sort a list with two elements?
You could compare each value and build a new list like so.
[a, b] = [2, 1]
if a <= b, do: [a, b], else: [b, a]
Sort A Four Element List
How might we sort a list with four elements?
Well, we know how to sort a list with 2 elements. So what if we split the four-element list
[4, 2, 3, 1]
into two lists?
flowchart
a["[4, 3, 2, 1]"] --> b["[4, 3]"]
a --> c["[2, 1]"]
We can use Enum.split/2 with the midpoint of the list (2
) to do that.
Enum.split([4, 3, 2, 1], 2)
Then we already know how to sort those lists.
flowchart
b["[4, 3]"] --> d["[3, 4]"]
c["[2, 1]"] --> e["[1, 2]"]
{[a, b], [c, d]} = Enum.split([4, 3, 2, 1], 2)
list_a = if a <= b, do: [a, b], else: [b, a]
list_b = if c <= d, do: [c, d], else: [d, a]
{list_a, list_b}
Merge Sorted Lists
Now here’s a problem we don’t know the answer to. How do we merge two sorted lists?
Anytime we encounter a problem we don’t know the solution to, it can be wise to consider the simplest case. Sometimes this is called the base case.
Merge Empty Lists
So the simplest lists to merge would be two empty lists, or a list with one element and an empty list.
We’ll start organizing our code in a MergeSort
module.
MergeSort.merge([], [])
[]
MergeSort.merge([1], [])
[1]
MergeSort.merge([], [2])
[2]
In code that might be
defmodule MergeSort do
def merge([], []), do: []
def merge([], list_b), do: list_b
def merge(list_a, []), do: list_a
end
MergeSort.merge([1], [])
Merge Single Element Lists
We want to sort two single element lists.
MergeSort.merge([2], [1])
[1, 2]
Things get trickier, but we can reuse the knowledge of how to sort a list with two elements. We’ve already sorted one two-element list, so separate the head of each list, and compare them.
flowchart TB
A --> B
B --> C
subgraph A[step 0]
direction LR
list_a["[a]"] ---
list_b["[b]"]
end
subgraph B[step 1]
a --- c1[compare] --- b
end
subgraph C[step 2]
c2[compare] --> g[greater than] --> f["[b, a]"]
c2[compare] --> l[less than or equal] --> t["[a, b]"]
end
We return [head_a, head_b]
When the head of list_a is less than or equal to the head of list_b.
Otherwise we return [head_b, head_a]
def merge([head_a | _], [head_b | _]) when head_a <= head_b do
[head_a, head_b]
end
def merge([head_a | _], [head_b | _]) do
[head_b, head_a]
end
Here’s all of that put together into a MergeSort
module.
defmodule MergeSort do
def merge([], []), do: []
def merge([], list_b), do: list_b
def merge(list_a, []), do: list_a
def merge([head_a | _], [head_b | _]) when head_a <= head_b do
[head_a, head_b]
end
def merge([head_a | _], [head_b | _]) do
[head_b, head_a]
end
end
MergeSort.merge([2], [1])
Sort Two Multi-Element Lists
Here’s the fun part, where we involve recursion. Likely this is the most challenging leap. How do we go from sorting two single element lists to multi-element lists?
MergeSort.merge([3, 4], [1, 2])
[1, 2, 3, 4]
Our merge/2
function currently gets us part the way there, it compares the heads of the two
sorted lists and sorts them.
MergeSort.merge([3, 4], [1, 2])
So, what if we took the sorted head, then attempted to merge the remaining elements in the two lists? You can imagine that visualized step by step as:
[a | _] = MergeSort.merge([3, 4], [1, 2])
[b | _] = MergeSort.merge([3, 4], [2])
[c | _] = MergeSort.merge([3, 4], [])
[d | _] = MergeSort.merge([4], [])
[a, b, c, d]
To actually implement this, instead of returning [head_a, head_b]
or [head_b, head_a]
, we
instead, recursively merge the remaining elements.
def merge([head_a | tail_a], list_b) when head_a <= hd(list_b) do
[head_a | merge(tail_a, list_b)]
end
def merge(list_a, [head_b | tail_b]) do
[head_b | merge(list_a, tail_b)]
end
Put together, here’s our new MergeSort
module. It can now merge two pre-sorted lists.
defmodule MergeSort do
def merge([], []), do: []
def merge([], list_b), do: list_b
def merge(list_a, []), do: list_a
def merge([head_a | tail_a], list_b) when head_a <= hd(list_b) do
[head_a | merge(tail_a, list_b)]
end
def merge(list_a, [head_b | tail_b]) do
[head_b | merge(list_a, tail_b)]
end
end
MergeSort.merge([3, 4], [1, 2])
Merge Sort A List
Now that we know how to merge two sorted lists of any size, we’re ready to put everything together
to create the MergeSort.sort/1
function.
MergeSort.sort([4, 2, 3, 1])
[1, 2, 3, 4]
In a divide and conquer algorithm we typically split the collection in half recursively
To split any list in half, we can use the Enum.split/2 function with the midpoint of the list rounded down. To find the midpoint of the list, we can use Enum.count/1 function to find the length and then divide the count by 2. We use integer division to ensure the count is rounded down.
def halve(list) do
length = Enum.count(list)
mid_point = div(length, 2)
Enum.split(list, mid_point)
end
During the sort, we will recursively half the list to create a branching tree of sublists (ensuring that we stop when the list has 1 or 0 elements)
flowchart
A["[4, 2, 3, 1]"]
B1["[4, 2]"]
B2["[3, 1]"]
C1["[4]"]
C2["[2]"]
C3["[3]"]
C4["[1]"]
A --> B1
A --> B2
B1 --> C1
B1 --> C2
B2 --> C3
B2 --> C4
defmodule MergeSort do
def halve(list) do
length = Enum.count(list)
mid_point = div(length, 2)
Enum.split(list, mid_point)
end
def sort([head | []]), do: [head]
def sort(list) do
{list_a, list_b} = halve(list)
left_sorted = sort(list_a)
right_sorted = sort(list_b)
[left_sorted, right_sorted]
end
end
MergeSort.sort([4, 2, 3, 1])
Once we have these sublists, we can then merge them together.
flowchart
A["[1, 2, 3, 4]"]
B1["[2, 4]"]
B2["[1, 3]"]
C1["[4]"]
C2["[2]"]
C3["[3]"]
C4["[1]"]
subgraph 1[left sorted]
C1
end
subgraph 2[right sorted]
C2
end
subgraph 3[left sorted]
C3
end
subgraph 4[right sorted]
C4
end
subgraph 5[left sorted]
B1
end
subgraph 6[right sorted]
B2
end
M1[merge]
M2[merge]
M3[merge]
1 --- M1
2 --- M1
M1 --- 5
3 --- M2
4 --- M2
M2 --- 6
5 --- M3
6 --- M3
M3 --- A
def sort([head | []]), do: [head]
def sort(list) do
{list_a, list_b} = halve(list)
left_sorted = sort(list_a)
right_sorted = sort(list_b)
merge(left_sorted, right_sorted)
end
All put together, we’ve finished our MergeSort
!
defmodule MergeSort do
def merge([], []), do: []
def merge([], list_b), do: list_b
def merge(list_a, []), do: list_a
def merge([head_a | tail_a], list_b) when head_a <= hd(list_b) do
[head_a | merge(tail_a, list_b)]
end
def merge(list_a, [head_b | tail_b]) do
[head_b | merge(list_a, tail_b)]
end
def halve(list) do
length = Enum.count(list)
mid_point = div(length, 2)
Enum.split(list, mid_point)
end
def sort([head | []]), do: [head]
def sort(list) do
{list_a, list_b} = halve(list)
left_sorted = sort(list_a)
right_sorted = sort(list_b)
merge(left_sorted, right_sorted)
end
end
MergeSort.sort([4, 2, 3, 1])
Now truthfully, you won’t often need to write your own sorting algorithms, however, it’s useful to be aware of divide and conquer as a problem-solving tool.
Most of the time, you’ll find built-in solutions or libraries for common problems.
In fact, we must (shamefully) admit that the MergeSort
solution has worse performance than
the built-in Enum.sort
module.
Benchee.run(
%{
"enum" => fn list -> Enum.sort(list) end,
"merge_sort" => fn list -> MergeSort.sort(list) end
},
inputs: %{
"small" => 1..1000 |> Enum.to_list() |> Enum.shuffle(),
"medium" => 1..10_000 |> Enum.to_list() |> Enum.shuffle()
}
)
Your Turn
In the Elixir cell below see if you can improve the performance of the MergeSort
in this new
ModifiedMergeSort
module. Ensure its behavior remains the same.
Hint You might consider removing the assignment of values to variables. These intermediate steps may improve readability, but not performance.
defmodule ModifiedMergeSort do
def merge([], []), do: []
def merge([], list_b), do: list_b
def merge(list_a, []), do: list_a
def merge([head_a | tail_a], list_b) when head_a <= hd(list_b) do
[head_a | merge(tail_a, list_b)]
end
def merge(list_a, [head_b | tail_b]) do
[head_b | merge(list_a, tail_b)]
end
def halve(list) do
length = Enum.count(list)
mid_point = div(length, 2)
Enum.split(list, mid_point)
end
def sort([head | []]), do: [head]
def sort(list) do
{list_a, list_b} = halve(list)
left_sorted = sort(list_a)
right_sorted = sort(list_b)
merge(left_sorted, right_sorted)
end
end
Once you have refactored the above code, you can use this Benchmark to verify your solution is faster or slower.
Benchee.run(
%{
"original" => fn list -> MergeSort.sort(list) end,
"modified" => fn list -> ModifiedMergeSort.sort(list) end
},
inputs: %{
"small" => 1..1000 |> Enum.to_list() |> Enum.shuffle(),
"medium" => 1..10_000 |> Enum.to_list() |> Enum.shuffle(),
"large" => 1..100_000 |> Enum.to_list() |> Enum.shuffle()
}
)
Mark As Completed
file_name = Path.basename(Regex.replace(~r/#.+/, __ENV__.file, ""), ".livemd")
progress_path = __DIR__ <> "/../progress.json"
existing_progress = File.read!(progress_path) |> Jason.decode!()
default = Map.get(existing_progress, file_name, false)
form =
Kino.Control.form(
[
completed: input = Kino.Input.checkbox("Mark As Completed", default: default)
],
report_changes: true
)
Task.async(fn ->
for %{data: %{completed: completed}} <- Kino.Control.stream(form) do
File.write!(progress_path, Jason.encode!(Map.put(existing_progress, file_name, completed)))
end
end)
form
Commit Your Progress
Run the following in your command line from the curriculum folder to track and save your progress in a Git commit.
Ensure that you do not already have undesired or unrelated changes by running git status
or by checking the source control tab in Visual Studio Code.
$ git checkout solutions
$ git checkout -b deprecated-divide-and-conquer-reading
$ git add .
$ git commit -m "finish deprecated divide and conquer reading"
$ git push origin deprecated-divide-and-conquer-reading
Create a pull request from your deprecated-divide-and-conquer-reading
branch to your solutions
branch.
Please do not create a pull request to the DockYard Academy repository as this will spam our PR tracker.
DockYard Academy Students Only:
Notify your instructor by including @BrooklinJazz
in your PR description to get feedback.
You (or your instructor) may merge your PR into your solutions branch after review.
If you are interested in joining the next academy cohort, sign up here to receive more news when it is available.