Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why Elixir's MapSet becomes unordered after 32 elements?

Tags:

set

elixir

iex> MapSet.new(1..32) |> Enum.to_list
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]

iex> MapSet.new(1..33) |> Enum.to_list
[11, 26, 15, 20, 17, 25, 13, 8, 7, 1, 32, 3, 6, 2, 33, 10, 9, 19, 14, 5, 18, 31,
 22, 29, 21, 27, 24, 30, 23, 28, 16, 4, 12]

Here's the implementation in Elixir 1.3

def new(enumerable) do
  map =
    enumerable
    |> Enum.to_list
    |> do_new([])

  %MapSet{map: map}
end

defp do_new([], acc) do
  acc
  |> :lists.reverse
  |> :maps.from_list
end

defp do_new([item | rest], acc) do
  do_new(rest, [{item, true} | acc])
end

Even though the order doesn't matter in a MapSet, but still wondering why a MapSet becomes unordered after 32 elements?

like image 904
sbs Avatar asked Jul 15 '16 00:07

sbs


1 Answers

This is not specific to MapSet, but the same thing happens with normal Map (MapSet uses Map under the hood):

iex(1)> for i <- Enum.shuffle(1..32), into: %{}, do: {i, i}
%{1 => 1, 2 => 2, 3 => 3, 4 => 4, 5 => 5, 6 => 6, 7 => 7, 8 => 8, 9 => 9,
  10 => 10, 11 => 11, 12 => 12, 13 => 13, 14 => 14, 15 => 15, 16 => 16,
  17 => 17, 18 => 18, 19 => 19, 20 => 20, 21 => 21, 22 => 22, 23 => 23,
  24 => 24, 25 => 25, 26 => 26, 27 => 27, 28 => 28, 29 => 29, 30 => 30,
  31 => 31, 32 => 32}
iex(2)> for i <- Enum.shuffle(1..33), into: %{}, do: {i, i}
%{11 => 11, 26 => 26, 15 => 15, 20 => 20, 17 => 17, 25 => 25, 13 => 13, 8 => 8,
  7 => 7, 1 => 1, 32 => 32, 3 => 3, 6 => 6, 2 => 2, 33 => 33, 10 => 10, 9 => 9,
  19 => 19, 14 => 14, 5 => 5, 18 => 18, 31 => 31, 22 => 22, 29 => 29, 21 => 21,
  27 => 27, 24 => 24, 30 => 30, 23 => 23, 28 => 28, 16 => 16, 4 => 4, 12 => 12}

This is because (most likely as an optimization) Erlang stores Maps of size upto MAP_SMALL_MAP_LIMIT as a sorted by key array. Only after the size is greater than MAP_SMALL_MAP_LIMIT Erlang switches to storing the data in a Hash Array Mapped Trie like data structure. In non-debug mode Erlang, MAP_SMALL_MAP_LIMIT is defined to be 32, so all maps with length upto 32 should print in sorted order. Note that this is an implementation detail as far as I know, and you should not rely on this behavior; they may change the value of the constant in the future or switch to a completely different algorithm if it's more performant.

like image 56
Dogbert Avatar answered Nov 10 '22 00:11

Dogbert