
How do I split a list by keyword in Elixir?

Tags:

elixir

Let's say I have a list of words, where a keyword, in this case "stop", demarcates full sentences:

["Hello", "from", "Paris", "stop", "Weather", "is", "sunny", "stop", "Missing", "you", "stop"]

which I want to turn into:

[["Hello", "from", "Paris"], ["Weather", "is", "sunny"], ["Missing", "you"]]

I know I can do this for strings with String.split, but ideally I'd like to learn how to tackle this problem with fundamental functional constructs, such as recursion on [head | tail]. I cannot figure out where to start, in particular how to accumulate the intermediate lists.

Thomas Browne asked Aug 22 '16 20:08


2 Answers

You can use Enum.chunk_by/2:

["Hello", "from", "Paris", "stop", "Weather", "is", "sunny", "stop", "Missing", "you", "stop"]
|> Enum.chunk_by(fn x -> x != "stop" end)
|> Enum.reject(fn x -> x == ["stop"] end)
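
One edge case worth noting: if the keyword appears twice in a row, chunk_by groups both occurrences into a single ["stop", "stop"] chunk, which the reject above does not remove. A minimal sketch of a variant that drops keyword runs of any length follows. The module name ChunkSketch and the function split_on/2 are hypothetical names chosen for illustration, not part of the original answer.

```elixir
defmodule ChunkSketch do
  # Hypothetical helper, not from the answer above.
  def split_on(list, keyword) do
    list
    # Group consecutive elements by whether they are the keyword.
    |> Enum.chunk_by(fn x -> x == keyword end)
    # Chunks are never empty, so inspecting the first element is safe.
    # A chunk that starts with the keyword consists only of keywords.
    |> Enum.reject(fn [first | _] -> first == keyword end)
  end
end

ChunkSketch.split_on(["a", "stop", "stop", "b", "stop"], "stop")
|> IO.inspect()
```

For inputs without repeated keywords, such as the list in the question, this should give the same result as the pipeline above.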

Performance

Out of curiosity, I benchmarked the implementations given in the answers to this question. Each benchmark made 100,000 calls to one implementation, and I ran it 3 times. Here are the results, in case anyone is interested:

Run 1     | Run 2     | Run 3     | Implementation
0.292903s | 0.316024s | 0.292106s | chunk_by
0.168113s | 0.152456s | 0.151854s | Main.main (@Dogbert's answer)
0.167387s | 0.148059s | 0.143763s | chunk_on (@Martin Svalin's answer)
0.177080s | 0.180632s | 0.185636s | splitter (@stephen_m's answer)
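
The harness behind these numbers is not shown, so the following is only a sketch of how such a measurement could be made, assuming Erlang's :timer.tc/1 for timing. The module BenchSketch and its functions are hypothetical names, and timings will differ from machine to machine.

```elixir
defmodule BenchSketch do
  # Hypothetical harness; the answer does not show the code it used.
  @words ["Hello", "from", "Paris", "stop", "Weather", "is", "sunny", "stop", "Missing", "you", "stop"]

  # The chunk_by pipeline from this answer, wrapped in a function.
  def chunk_by_impl(list) do
    list
    |> Enum.chunk_by(fn x -> x != "stop" end)
    |> Enum.reject(fn x -> x == ["stop"] end)
  end

  # Calls `fun` on the sample list `calls` times and returns elapsed seconds.
  def time(fun, calls \\ 100_000) do
    # :timer.tc/1 returns {elapsed_microseconds, result_of_the_function}.
    {microseconds, :ok} =
      :timer.tc(fn ->
        Enum.each(1..calls, fn _ -> fun.(@words) end)
      end)

    microseconds / 1_000_000
  end
end

# Three runs, mirroring the three columns in the results.
for run <- 1..3 do
  seconds = BenchSketch.time(&BenchSketch.chunk_by_impl/1)
  IO.puts("run #{run}: #{seconds}s | chunk_by")
end
```

The other implementations could be timed the same way by passing a different one-argument function to time/2.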

AbM answered Sep 19 '22 17:09


Here's a simple tail-recursive implementation using pattern matching:

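defmodule Main do
  # Splits `list` into sublists at every element equal to `on`.
  def split_on(list, on) do
    list
    # Walk the list back to front so that prepending restores the order.
    |> Enum.reverse()
    |> do_split_on(on, [[]])
    # Drop empty sublists, such as the one left over by the trailing "stop".
    |> Enum.reject(fn list -> list == [] end)
  end

  # Input exhausted: the accumulator holds the result.
  defp do_split_on([], _, acc), do: acc
  # The head equals `on` (same variable `h` in both positions): open a new sublist.
  defp do_split_on([h | t], h, acc), do: do_split_on(t, h, [[] | acc])
  # Any other element: prepend it to the sublist currently being built.
  defp do_split_on([h | t], on, [h2 | t2]), do: do_split_on(t, on, [[h | h2] | t2])

  def main do
    ["Hello", "from", "Paris", "stop", "Weather", "is", "sunny", "stop", "Missing", "you", "stop"]
    |> split_on("stop")
    |> IO.inspect()
  end
end

Main.main()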

Output:

[["Hello", "from", "Paris"], ["Weather", "is", "sunny"], ["Missing", "you"]]
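
The implementation above reverses the input first so that prepending yields the right order. As an alternative illustration of accumulating intermediate lists, here is a sketch that walks the list front to back with two accumulators: the sentence being built and the sentences already finished. The names ForwardSketch, walk/4 and flush/2 are hypothetical and not taken from any of the answers.

```elixir
defmodule ForwardSketch do
  # Hypothetical module and function names, for illustration only.
  def split_on(list, keyword), do: walk(list, keyword, [], [])

  # End of input: flush any sentence still open, then restore sentence order.
  defp walk([], _keyword, current, done), do: Enum.reverse(flush(current, done))

  # Head equals the keyword (the repeated variable forces the match):
  # close the current sentence and start a new one.
  defp walk([keyword | tail], keyword, current, done),
    do: walk(tail, keyword, [], flush(current, done))

  # Any other word: prepend it to the sentence being built.
  defp walk([word | tail], keyword, current, done),
    do: walk(tail, keyword, [word | current], done)

  # Skip empty sentences; otherwise un-reverse the words and store them.
  defp flush([], done), do: done
  defp flush(current, done), do: [Enum.reverse(current) | done]
end

["Hello", "from", "Paris", "stop", "Weather", "is", "sunny", "stop", "Missing", "you", "stop"]
|> ForwardSketch.split_on("stop")
|> IO.inspect()
```

Both accumulators are built by prepending, which is cheap for linked lists, so each one is reversed once when it is handed off.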
Dogbert answered Sep 17 '22 17:09