
How do I map and group_by at the same time?

Tags:

elixir

As an example, let's say I have an enumerable collection of pairs {first, second}. Grouping these pairs using

Enum.group_by(collection, fn {first, _second} -> first end)

will result in a Map whose keys are determined by the passed anonymous function. Its values are collections of pairs. However, I would like its values to contain the pair's second elements instead.


In general, given an enumerable, I would like to group providing both a key extractor and a value mapper, so that I can determine what gets put into the resulting Map's values. I.e., I would like something like

map_group_by(
  collection,
  fn {_first, second} -> second end,
  fn {first, _second} -> first end
)

where collection's values are mapped before being grouped, yet where the key mapper still operates on the original elements.

Is there such a function in the standard library? If not, what is the most idiomatic way to achieve this?


I know I could do something like

Enum.reduce(
  collection,
  %{},
  fn {key, value}, acc -> Map.update(acc, key, [value], &[value | &1]) end
)

but this seems clunky and creates [value] lists preemptively (is that actually true?). Is there a better way that is both concise and efficient?

user4235730 asked Feb 15 '16 18:02





2 Answers

Since Elixir 1.3, Enum.group_by/3 accepts a third argument: a function that maps each element to the value stored under its key. This solves exactly this problem:

Enum.group_by(enumerable, &elem(&1, 0), &elem(&1, 1))
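As a quick illustration, here is that call on a small list of pairs. The sample data and the GroupHelpers module name are made up for this sketch; the wrapper only shows how the argument order of the map_group_by from the question could be kept, assuming Elixir 1.3 or later.

```elixir
defmodule GroupHelpers do
  # Hypothetical wrapper: same argument order as map_group_by in the question
  # (value mapper first, key extractor second), delegating to Enum.group_by/3.
  def map_group_by(enumerable, value_mapper, key_extractor) do
    Enum.group_by(enumerable, key_extractor, value_mapper)
  end
end

# Sample data, invented for illustration.
collection = [{:a, 1}, {:b, 2}, {:a, 3}]

grouped = Enum.group_by(collection, &elem(&1, 0), &elem(&1, 1))
IO.inspect(grouped)
# prints %{a: [1, 3], b: [2]}

wrapped =
  GroupHelpers.map_group_by(
    collection,
    fn {_first, second} -> second end,
    fn {first, _second} -> first end
  )

IO.inspect(wrapped == grouped)
# prints true
```

Within each group the values keep the order in which they appeared in the input.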

Obsolete answer:

At the time of writing (before Elixir 1.3), there was no such function in the standard library. I ended up using this:

def map_group_by(enumerable, value_mapper, key_extractor) do
  Enum.reduce(Enum.reverse(enumerable), %{}, fn(entry, categories) ->
    value = value_mapper.(entry)
    Map.update(categories, key_extractor.(entry), [value], &[value | &1])
  end)
end

which can (for my example) then be called like this:

map_group_by(
  collection,
  fn {_, second} -> second end,
  fn {first, _} -> first end
)

It is adapted from the standard library's Enum.group_by. Regarding the [value]: I don't know what the compiler can or cannot optimize, but at least this is what Enum.group_by does as well.

Note the Enum.reverse call, which was not in the example from my question. Because each value is prepended to its list, reversing the input first ensures that the element order is preserved in the resulting value lists. If you do not need that order to be preserved (as in my case, where I only wanted to sample from the result anyway), it can be dropped.
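The effect of the Enum.reverse call can be seen in a small sketch. The sample pairs and the variable names below are invented for illustration; the reducer is the same prepend-based update used in map_group_by.

```elixir
# Sample data, invented for illustration.
pairs = [{:a, 1}, {:a, 2}, {:b, 3}]

# Prepend each value to the list stored under its key.
prepend = fn {key, value}, acc -> Map.update(acc, key, [value], &[value | &1]) end

# Without reversing first, prepending leaves each group in reverse input order.
unordered = Enum.reduce(pairs, %{}, prepend)
IO.inspect(unordered)
# prints %{a: [2, 1], b: [3]}

# Reversing the input first restores the original order within each group.
ordered = Enum.reduce(Enum.reverse(pairs), %{}, prepend)
IO.inspect(ordered)
# prints %{a: [1, 2], b: [3]}
```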

user4235730 answered Oct 18 '22 07:10



Real answer

Since Elixir 1.3 there is now Enum.group_by/3, whose third argument is a function that is applied to each element to produce the value stored under its key.
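A minimal sketch of that call on a keyword list, assuming Elixir 1.3 or later (the sample list is the one used further down in this answer; the variable names are illustrative):

```elixir
# Keyword-list input: each element is a {key, value} tuple.
list = [a: 2, a: 3, b: 3]

result =
  Enum.group_by(
    list,
    # Second argument: computes the group key from the element.
    fn {key, _value} -> key end,
    # Third argument: computes the value stored under that key.
    fn {_key, value} -> value end
  )

IO.inspect(result)
# prints %{a: [2, 3], b: [3]}
```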


Obsolete Answer

Before Elixir 1.3, this was my solution:

To start off, it's important to notice, as the Elixir docs show, that a list of two-element tuples whose first elements are atoms is the same as a keyword list:

iex> list = [{:a, 1}, {:b, 2}]
[a: 1, b: 2]
iex> list == [a: 1, b: 2]
true

With this in mind, it's easy to use Enum.map across it.

This does make two passes over it, but it's a little cleaner looking than what you had:

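defmodule EnumHelpers do
  def map_col(lst) do
    lst
    # First pass: group the {key, value} pairs by key.
    |> Enum.group_by(fn {x, _} -> x end)
    # Second pass: keep only the values from each group's pairs.
    |> Enum.map(fn {x, pairs} -> {x, Enum.map(pairs, fn {_, v} -> v end)} end)
  end
end

IO.inspect(EnumHelpers.map_col([a: 2, a: 3, b: 3]))

which will print out:

[a: [2, 3], b: [3]]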

Edit: Faster Version:

defmodule EnumHelpers do

  defp group_one({key, val}, categories) do
    Map.update(categories, key, [val], &[val | &1])
  end

  def map_col_fast(coll) do
    Enum.reduce(coll, %{}, &group_one/2)
  end
end

IO.inspect EnumHelpers.map_col_fast([a: 2, a: 3, b: 3])
JustGage answered Oct 18 '22 07:10
