I've found that counting with Enum.map |> Enum.sum
works much faster than Enum.count
. But why is not the built-in count function efficient?
Compare these two lines:
elements |> Enum.count(&check/1)
elements |> Enum.map(&(if check(&1), do: 1, else: 0)) |> Enum.sum
The results of tests with different length of list:
len | map(), µs | count(), µs
===============================
10 | 0.67 | 2.55
100 | 5.52 | 8.91
1000 | 59.00 | 73.12
10000 | 642.50 | 734.60
Source code: https://gist.github.com/artemrizhov/fc146f7ab390f7a807be833099e5cb83
$ elixir --version
Erlang/OTP 19 [erts-8.1] [source-e7be63d] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]
Elixir 1.3.4
In Elixir, an enumerable is any data type that implements the Enumerable protocol. List s ( [1, 2, 3] ), Map s ( %{foo: 1, bar: 2} ) and Range s ( 1..3 ) are common data types used as enumerables: iex> Enum. map([1, 2, 3], fn x -> x * 2 end) [2, 4, 6] iex> Enum. sum([1, 2, 3]) 6 iex> Enum.
Streams are useful when working with huge, potentially infinite data sets. With large data sets, streams are more suitable as they do not fill up the memory with all the data at once, which Enum would do due to its intermediate lists.
This is because both Enum.map/2
and Enum.reduce/3
(that is used by sum
) are heavily optimised for lists, while Enum.count/2
has only a generic enumerable version.
To add to confusion, there's even a faster implementation:
elements |> Enum.filter(&check/1) |> Enum.count
On my machine, using modified benchmark you provided I get a consistent result of:
len = 10
map: 2.1706 μs
count: 7.0754 μs
filter: 1.9765 μs
len = 100
map: 19.921 μs
count: 27.579 μs
filter: 14.591 μs
len = 1000
map: 168.93 μs
count: 178.93 μs
filter: 163.43 μs
len = 10000
map: 2128.9 μs
count: 1822.1 μs
filter: 1664.6 μs
That said both the "map" and the "filter" version produce more garbage when they operate, so some of the time gains might be consumed by the extended garbage collection time. This is already visible in the benchmarks, as the relative gains between the versions decrease as the length of the length of the list (and the amount of intermediate garbage produced) increase.
EDIT: I submitted a PR that optimises the Enum.count/2
function to be the fastest one https://github.com/elixir-lang/elixir/pull/5567
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With