Enumerable
Enumerable protocol used by Enum
and Stream
modules.
When you invoke a function in the Enum
module, the first argument
is usually a collection that must implement this protocol.
For example, the expression Enum.map([1, 2, 3], &(&1 * 2))
invokes Enumerable.reduce/3
to perform the reducing operation that
builds a mapped list by calling the mapping function &(&1 * 2)
on
every element in the collection and consuming the element with an
accumulated list.
Internally, Enum.map/2
is implemented as follows:
def map(enumerable, fun) do
reducer = fn x, acc -> {:cont, [fun.(x) | acc]} end
Enumerable.reduce(enumerable, {:cont, []}, reducer) |> elem(1) |> :lists.reverse()
end
Note that the user-supplied function is wrapped into a t:reducer/0
function.
The t:reducer/0
function must return a tagged tuple after each step,
as described in the t:acc/0
type. At the end, Enumerable.reduce/3
returns t:result/0
.
This protocol uses tagged tuples to exchange information between the
reducer function and the data type that implements the protocol. This
allows enumeration of resources, such as files, to be done efficiently
while also guaranteeing the resource will be closed at the end of the
enumeration. This protocol also allows suspension of the enumeration,
which is useful when interleaving between many enumerables is required
(as in the zip/1
and zip/2
functions).
This protocol requires four functions to be implemented, reduce/3
,
count/1
, member?/2
, and slice/1
. The core of the protocol is the
reduce/3
function. All other functions exist as optimizations paths
for data structures that can implement certain properties in better
than linear time.
Function count/1
Retrieves the number of elements in the enumerable
.
It should return {:ok, count}
if you can count the number of elements
in enumerable
without traversing it.
Otherwise it should return {:error, __MODULE__}
and a default algorithm
built on top of reduce/3
that runs in linear time will be used.
Function member?/2
Checks if an element
exists within the enumerable
.
It should return {:ok, boolean}
if you can check the membership of a
given element in enumerable
with ===/2
without traversing the whole
of it.
Otherwise it should return {:error, __MODULE__}
and a default algorithm
built on top of reduce/3
that runs in linear time will be used.
When called outside guards, the in
and not in
operators work by using this function.
Function reduce/3
Reduces the enumerable
into an element.
Most of the operations in Enum
are implemented in terms of reduce.
This function should apply the given t:reducer/0
function to each
element in the enumerable
and proceed as expected by the returned
accumulator.
See the documentation of the types t:result/0
and t:acc/0
for
more information.
Examples
As an example, here is the implementation of reduce
for lists:
def reduce(_list, {:halt, acc}, _fun), do: {:halted, acc}
def reduce(list, {:suspend, acc}, fun), do: {:suspended, acc, &reduce(list, &1, fun)}
def reduce([], {:cont, acc}, _fun), do: {:done, acc}
def reduce([head | tail], {:cont, acc}, fun), do: reduce(tail, fun.(head, acc), fun)
Function slice/1
Returns a function that slices the data structure contiguously.
It should return {:ok, size, slicing_fun}
if the enumerable
has
a known bound and can access a position in the enumerable
without
traversing all previous elements.
Otherwise it should return {:error, __MODULE__}
and a default
algorithm built on top of reduce/3
that runs in linear time will be
used.
Differences to count/1
The size
value returned by this function is used for boundary checks,
therefore it is extremely important that this function only returns :ok
if retrieving the size
of the enumerable
is cheap, fast and takes constant
time. Otherwise the simplest of operations, such as Enum.at(enumerable, 0)
,
will become too expensive.
On the other hand, the count/1
function in this protocol should be
implemented whenever you can count the number of elements in the collection without
traversing it.
count/1
Retrieves the number of elements in the enumerable
.
It should return {:ok, count}
if you can count the number of elements
in enumerable
without traversing it.
Otherwise it should return {:error, __MODULE__}
and a default algorithm
built on top of reduce/3
that runs in linear time will be used.
member?/2
Checks if an element
exists within the enumerable
.
It should return {:ok, boolean}
if you can check the membership of a
given element in enumerable
with ===/2
without traversing the whole
of it.
Otherwise it should return {:error, __MODULE__}
and a default algorithm
built on top of reduce/3
that runs in linear time will be used.
When called outside guards, the in
and not in
operators work by using this function.
reduce/3
Reduces the enumerable
into an element.
Most of the operations in Enum
are implemented in terms of reduce.
This function should apply the given t:reducer/0
function to each
element in the enumerable
and proceed as expected by the returned
accumulator.
See the documentation of the types t:result/0
and t:acc/0
for
more information.
Examples
As an example, here is the implementation of reduce
for lists:
def reduce(_list, {:halt, acc}, _fun), do: {:halted, acc}
def reduce(list, {:suspend, acc}, fun), do: {:suspended, acc, &reduce(list, &1, fun)}
def reduce([], {:cont, acc}, _fun), do: {:done, acc}
def reduce([head | tail], {:cont, acc}, fun), do: reduce(tail, fun.(head, acc), fun)
slice/1
Returns a function that slices the data structure contiguously.
It should return {:ok, size, slicing_fun}
if the enumerable
has
a known bound and can access a position in the enumerable
without
traversing all previous elements.
Otherwise it should return {:error, __MODULE__}
and a default
algorithm built on top of reduce/3
that runs in linear time will be
used.
Differences to count/1
The size
value returned by this function is used for boundary checks,
therefore it is extremely important that this function only returns :ok
if retrieving the size
of the enumerable
is cheap, fast and takes constant
time. Otherwise the simplest of operations, such as Enum.at(enumerable, 0)
,
will become too expensive.
On the other hand, the count/1
function in this protocol should be
implemented whenever you can count the number of elements in the collection without
traversing it.
Type acc
The accumulator value for each step.
It must be a tagged tuple with one of the following “tags”:
-
:cont
- the enumeration should continue -
:halt
- the enumeration should halt immediately -
:suspend
- the enumeration should be suspended immediately
Depending on the accumulator value, the result returned byEnumerable.reduce/3
will change. Please check the t:result/0
type documentation for more information.
In case a t:reducer/0
function returns a :suspend
accumulator,
it must be explicitly handled by the caller and never leak.
Type reducer
The reducer function.
Should be called with the enumerable
element and the
accumulator contents.
Returns the accumulator for the next enumeration step.
Type result
The result of the reduce operation.
It may be done when the enumeration is finished by reaching its end, or halted/suspended when the enumeration was halted or suspended by the tagged accumulator.
In case the tagged :halt
accumulator is given, the :halted
tuple
with the accumulator must be returned. Functions like Enum.take_while/2
use :halt
underneath and can be used to test halting enumerables.
In case the tagged :suspend
accumulator is given, the caller must
return the :suspended
tuple with the accumulator and a continuation.
The caller is then responsible of managing the continuation and the
caller must always call the continuation, eventually halting or continuing
until the end. Enum.zip/2
uses suspension, so it can be used to test
whether your implementation handles suspension correctly. You can also use
Stream.zip/2
with Enum.take_while/2
to test the combination of
:suspend
with :halt
.
Type continuation
A partially applied reduce function.
The continuation is the closure returned as a result when the enumeration is suspended. When invoked, it expects a new accumulator and it returns the result.
A continuation can be trivially implemented as long as the reduce function is defined in a tail recursive fashion. If the function is tail recursive, all the state is passed as arguments, so the continuation is the reducing function partially applied.
Type slicing_fun
A slicing function that receives the initial position and the number of elements in the slice.
The start
position is a number >= 0
and guaranteed to
exist in the enumerable
. The length is a number >= 1
in a way
that start + length <= count
, where count
is the maximum
amount of elements in the enumerable.
The function should return a non empty list where
the amount of elements is equal to length
.