Powered by AppSignal & Oban Pro
Would you like to see your link here? Contact us
Notesclub

Divide And Conquer

deprecated_divide_and_conquer.livemd

Divide And Conquer

Mix.install([
  {:jason, "~> 1.4"},
  {:kino, "~> 0.9", override: true},
  {:youtube, github: "brooklinjazz/youtube"},
  {:hidden_cell, github: "brooklinjazz/hidden_cell"},
  {:benchee, "~> 1.1"}
])

Navigation

Return Home Report An Issue

Setup

Ensure you type the ea keyboard shortcut to evaluate all Elixir cells before starting. Alternatively, you can evaluate the Elixir cells as you read.

Merge Sort

Recursion is often associated with divide and conquer algorithms.

Divide and conquer algorithms recursively split the problem into many similar sub-problems.

Let’s take a popular divide and conquer sorting algorithm called merge sort to help understand.

Merge sort allows us to sort a list into ascending order, so [4, 3, 2] becomes [2, 3, 4]

MergeSort.sort([4, 3, 2])
[2, 3, 4]

Now, how might you implement a sorting function from scratch?

Well, let’s start simple, and work our way up.

Sort An Empty List

How could you sort an empty list? By doing nothing! An empty list is already sorted, the same goes for a list with one element.

[]
[1]

Sort A Two Element List

Now, let’s step up the challenge. How would you sort a list with two elements?

You could compare each value and build a new list like so.

[a, b] = [2, 1]

if a <= b, do: [a, b], else: [b, a]

Sort A Four Element List

How might we sort a list with four elements?

Well, we know how to sort a list with 2 elements. So what if we split the four-element list [4, 2, 3, 1] into two lists?

flowchart

 a["[4, 3, 2, 1]"] -->  b["[4, 3]"]
 a -->  c["[2, 1]"]

We can use Enum.split/2 with the midpoint of the list (2) to do that.

Enum.split([4, 3, 2, 1], 2)

Then we already know how to sort those lists.

flowchart

 b["[4, 3]"]  --> d["[3, 4]"]
 c["[2, 1]"] --> e["[1, 2]"]
{[a, b], [c, d]} = Enum.split([4, 3, 2, 1], 2)

list_a = if a <= b, do: [a, b], else: [b, a]
list_b = if c <= d, do: [c, d], else: [d, a]

{list_a, list_b}

Merge Sorted Lists

Now here’s a problem we don’t know the answer to. How do we merge two sorted lists?

Anytime we encounter a problem we don’t know the solution to, it can be wise to consider the simplest case. Sometimes this is called the base case.

Merge Empty Lists

So the simplest lists to merge would be two empty lists, or a list with one element and an empty list. We’ll start organizing our code in a MergeSort module.

MergeSort.merge([], [])
[]

MergeSort.merge([1], [])
[1]

MergeSort.merge([], [2])
[2]

In code that might be

defmodule MergeSort do
  def merge([], []), do: []
  def merge([], list_b), do: list_b
  def merge(list_a, []), do: list_a
end

MergeSort.merge([1], [])

Merge Single Element Lists

We want to sort two single element lists.

MergeSort.merge([2], [1])
[1, 2]

Things get trickier, but we can reuse the knowledge of how to sort a list with two elements. We’ve already sorted one two-element list, so separate the head of each list, and compare them.

flowchart TB
  A --> B
  B --> C
  subgraph A[step 0]
    direction LR
    list_a["[a]"] ---
    list_b["[b]"]
  end

  subgraph B[step 1]
  a --- c1[compare] --- b
  end

  subgraph C[step 2]
    c2[compare] --> g[greater than] --> f["[b, a]"]
    c2[compare] --> l[less than or equal] --> t["[a, b]"]

  end

We return [head_a, head_b] When the head of list_a is less than or equal to the head of list_b. Otherwise we return [head_b, head_a]

def merge([head_a | _], [head_b | _]) when head_a <= head_b do
  [head_a, head_b]
end

def merge([head_a | _], [head_b | _]) do
  [head_b, head_a]
end

Here’s all of that put together into a MergeSort module.

defmodule MergeSort do
  def merge([], []), do: []
  def merge([], list_b), do: list_b
  def merge(list_a, []), do: list_a

  def merge([head_a | _], [head_b | _]) when head_a <= head_b do
    [head_a, head_b]
  end

  def merge([head_a | _], [head_b | _]) do
    [head_b, head_a]
  end
end

MergeSort.merge([2], [1])

Sort Two Multi-Element Lists

Here’s the fun part, where we involve recursion. Likely this is the most challenging leap. How do we go from sorting two single element lists to multi-element lists?

MergeSort.merge([3, 4], [1, 2])
[1, 2, 3, 4]

Our merge/2 function currently gets us part the way there, it compares the heads of the two sorted lists and sorts them.

MergeSort.merge([3, 4], [1, 2])

So, what if we took the sorted head, then attempted to merge the remaining elements in the two lists? You can imagine that visualized step by step as:

[a | _] = MergeSort.merge([3, 4], [1, 2])
[b | _] = MergeSort.merge([3, 4], [2])
[c | _] = MergeSort.merge([3, 4], [])
[d | _] = MergeSort.merge([4], [])

[a, b, c, d]

To actually implement this, instead of returning [head_a, head_b] or [head_b, head_a], we instead, recursively merge the remaining elements.

def merge([head_a | tail_a], list_b) when head_a <= hd(list_b) do
  [head_a | merge(tail_a, list_b)]
end

def merge(list_a, [head_b | tail_b]) do
  [head_b | merge(list_a, tail_b)]
end

Put together, here’s our new MergeSort module. It can now merge two pre-sorted lists.

defmodule MergeSort do
  def merge([], []), do: []
  def merge([], list_b), do: list_b
  def merge(list_a, []), do: list_a

  def merge([head_a | tail_a], list_b) when head_a <= hd(list_b) do
    [head_a | merge(tail_a, list_b)]
  end

  def merge(list_a, [head_b | tail_b]) do
    [head_b | merge(list_a, tail_b)]
  end
end

MergeSort.merge([3, 4], [1, 2])

Merge Sort A List

Now that we know how to merge two sorted lists of any size, we’re ready to put everything together to create the MergeSort.sort/1 function.

MergeSort.sort([4, 2, 3, 1])
[1, 2, 3, 4]

In a divide and conquer algorithm we typically split the collection in half recursively

To split any list in half, we can use the Enum.split/2 function with the midpoint of the list rounded down. To find the midpoint of the list, we can use Enum.count/1 function to find the length and then divide the count by 2. We use integer division to ensure the count is rounded down.

def halve(list) do
    length = Enum.count(list)
    mid_point = div(length, 2)
    Enum.split(list, mid_point)
end

During the sort, we will recursively half the list to create a branching tree of sublists (ensuring that we stop when the list has 1 or 0 elements)

flowchart
  A["[4, 2, 3, 1]"]
  B1["[4, 2]"]
  B2["[3, 1]"]
  C1["[4]"]
  C2["[2]"]
  C3["[3]"]
  C4["[1]"]
  A --> B1
  A --> B2
  B1 --> C1
  B1 --> C2
  B2 --> C3
  B2 --> C4
defmodule MergeSort do
  def halve(list) do
    length = Enum.count(list)
    mid_point = div(length, 2)
    Enum.split(list, mid_point)
  end

  def sort([head | []]), do: [head]

  def sort(list) do
    {list_a, list_b} = halve(list)
    left_sorted = sort(list_a)
    right_sorted = sort(list_b)
    [left_sorted, right_sorted]
  end
end

MergeSort.sort([4, 2, 3, 1])

Once we have these sublists, we can then merge them together.

flowchart
  A["[1, 2, 3, 4]"]
  B1["[2, 4]"]
  B2["[1, 3]"]
  C1["[4]"]
  C2["[2]"]
  C3["[3]"]
  C4["[1]"]
  subgraph 1[left sorted]
    C1
  end
  subgraph 2[right sorted]
    C2
  end
  subgraph 3[left sorted]
    C3
  end
  subgraph 4[right sorted]
    C4
  end
  subgraph 5[left sorted]
    B1
  end
  subgraph 6[right sorted]
    B2
  end
  M1[merge]
  M2[merge]
  M3[merge]
  1 --- M1
  2 --- M1
  M1 --- 5
  3 --- M2
  4 --- M2
  M2 --- 6
  5 --- M3
  6 --- M3
  M3 --- A
def sort([head | []]), do: [head]
def sort(list) do
    {list_a, list_b} = halve(list)
    left_sorted = sort(list_a)
    right_sorted = sort(list_b)
    merge(left_sorted, right_sorted)
end

All put together, we’ve finished our MergeSort!

defmodule MergeSort do
  def merge([], []), do: []
  def merge([], list_b), do: list_b
  def merge(list_a, []), do: list_a

  def merge([head_a | tail_a], list_b) when head_a <= hd(list_b) do
    [head_a | merge(tail_a, list_b)]
  end

  def merge(list_a, [head_b | tail_b]) do
    [head_b | merge(list_a, tail_b)]
  end

  def halve(list) do
    length = Enum.count(list)
    mid_point = div(length, 2)
    Enum.split(list, mid_point)
  end

  def sort([head | []]), do: [head]

  def sort(list) do
    {list_a, list_b} = halve(list)
    left_sorted = sort(list_a)
    right_sorted = sort(list_b)
    merge(left_sorted, right_sorted)
  end
end

MergeSort.sort([4, 2, 3, 1])

Now truthfully, you won’t often need to write your own sorting algorithms, however, it’s useful to be aware of divide and conquer as a problem-solving tool.

Most of the time, you’ll find built-in solutions or libraries for common problems. In fact, we must (shamefully) admit that the MergeSort solution has worse performance than the built-in Enum.sort module.

Benchee.run(
  %{
    "enum" => fn list -> Enum.sort(list) end,
    "merge_sort" => fn list -> MergeSort.sort(list) end
  },
  inputs: %{
    "small" => 1..1000 |> Enum.to_list() |> Enum.shuffle(),
    "medium" => 1..10_000 |> Enum.to_list() |> Enum.shuffle()
  }
)

Your Turn

In the Elixir cell below see if you can improve the performance of the MergeSort in this new ModifiedMergeSort module. Ensure its behavior remains the same.

Hint You might consider removing the assignment of values to variables. These intermediate steps may improve readability, but not performance.

defmodule ModifiedMergeSort do
  def merge([], []), do: []
  def merge([], list_b), do: list_b
  def merge(list_a, []), do: list_a

  def merge([head_a | tail_a], list_b) when head_a <= hd(list_b) do
    [head_a | merge(tail_a, list_b)]
  end

  def merge(list_a, [head_b | tail_b]) do
    [head_b | merge(list_a, tail_b)]
  end

  def halve(list) do
    length = Enum.count(list)
    mid_point = div(length, 2)
    Enum.split(list, mid_point)
  end

  def sort([head | []]), do: [head]

  def sort(list) do
    {list_a, list_b} = halve(list)
    left_sorted = sort(list_a)
    right_sorted = sort(list_b)
    merge(left_sorted, right_sorted)
  end
end

Once you have refactored the above code, you can use this Benchmark to verify your solution is faster or slower.

Benchee.run(
  %{
    "original" => fn list -> MergeSort.sort(list) end,
    "modified" => fn list -> ModifiedMergeSort.sort(list) end
  },
  inputs: %{
    "small" => 1..1000 |> Enum.to_list() |> Enum.shuffle(),
    "medium" => 1..10_000 |> Enum.to_list() |> Enum.shuffle(),
    "large" => 1..100_000 |> Enum.to_list() |> Enum.shuffle()
  }
)

Mark As Completed

file_name = Path.basename(Regex.replace(~r/#.+/, __ENV__.file, ""), ".livemd")

progress_path = __DIR__ <> "/../progress.json"
existing_progress = File.read!(progress_path) |> Jason.decode!()

default = Map.get(existing_progress, file_name, false)

form =
  Kino.Control.form(
    [
      completed: input = Kino.Input.checkbox("Mark As Completed", default: default)
    ],
    report_changes: true
  )

Task.async(fn ->
  for %{data: %{completed: completed}} <- Kino.Control.stream(form) do
    File.write!(progress_path, Jason.encode!(Map.put(existing_progress, file_name, completed)))
  end
end)

form

Commit Your Progress

Run the following in your command line from the curriculum folder to track and save your progress in a Git commit. Ensure that you do not already have undesired or unrelated changes by running git status or by checking the source control tab in Visual Studio Code.

$ git checkout solutions
$ git checkout -b deprecated-divide-and-conquer-reading
$ git add .
$ git commit -m "finish deprecated divide and conquer reading"
$ git push origin deprecated-divide-and-conquer-reading

Create a pull request from your deprecated-divide-and-conquer-reading branch to your solutions branch. Please do not create a pull request to the DockYard Academy repository as this will spam our PR tracker.

DockYard Academy Students Only:

Notify your instructor by including @BrooklinJazz in your PR description to get feedback. You (or your instructor) may merge your PR into your solutions branch after review.

If you are interested in joining the next academy cohort, sign up here to receive more news when it is available.