
Should I ever prefer Enum to Stream in Elixir?

This may sound like I'm begging to start a flame war, but hear me out.

In some languages laziness is expensive. For example, in Ruby, where I have the most recent experience, laziness is slow because it's achieved using fibers, so it's only attractive when:

  • you must trade off CPU for memory (think paging through a large data set)
  • the performance penalty is worth it to hide details (yielding to fibers is a great way to abstract away complexity instead of passing down blocks to run in mysterious places)

Otherwise you'll definitely want to use the normal, eager methods.

My initial investigation suggests that the overhead for laziness in Elixir is much lower (this thread on Reddit backs me up), so there seems to be little reason to ever use Enum instead of Stream for those things which Stream can do.

Is there something I'm missing? I assume Enum exists for a reason, even though it implements some of the same functions as Stream. In what cases, if any, would I want to use Enum instead of Stream when I could use Stream?

G Gordon Worley III asked Oct 31 '16 19:10



2 Answers

The functions in Stream essentially build a "recipe" of transformations over your data, while the functions in Enum actually carry those transformations out. So you will eventually have to call an Enum function to resolve your data transformation, even if everything else is a Stream.
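As a minimal sketch of the "recipe" idea (the module and function names here are made up for illustration), the pipeline below does no work until an Enum function consumes it:

```elixir
defmodule RecipeDemo do
  # Builds a lazy pipeline. Nothing is computed here; the result is a
  # Stream struct describing the steps to apply later.
  def recipe(enumerable) do
    enumerable
    |> Stream.map(&(&1 * 2))
    |> Stream.filter(&(rem(&1, 3) == 0))
  end

  # Enum.to_list/1 forces the recipe to run and returns a real list.
  def resolve(enumerable) do
    enumerable
    |> recipe()
    |> Enum.to_list()
  end
end

lazy = RecipeDemo.recipe(1..10)
IO.inspect(is_list(lazy))               # false: still just a description
IO.inspect(RecipeDemo.resolve(1..10))   # [6, 12, 18]
```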

Also, some operations, such as reduce, have no real meaning in Stream: collapsing a collection into a single value means consuming it, so you must use Enum.

As for performance, if you are performing a series of transformations, working with a possibly infinite stream of data, or reading a file, use Stream. If you have just one transformation over a finite enumerable, or you need to resolve a Stream, use Enum.
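A rough illustration of that rule of thumb, with hypothetical function names: the first function chains several steps over an infinite source, so it stays lazy until Enum.take/2 resolves it, while the second is a single pass over a finite list where plain Enum is the simpler fit.

```elixir
defmodule PipelineDemo do
  # Infinite source plus several transformations: Stream describes the
  # work, and Enum.take/2 pulls only as many elements as are needed.
  def first_even_squares(n) do
    Stream.iterate(1, &(&1 + 1))
    |> Stream.map(&(&1 * &1))
    |> Stream.filter(&(rem(&1, 2) == 0))
    |> Enum.take(n)
  end

  # One transformation over a finite list: no reason to go lazy.
  def double_all(list), do: Enum.map(list, &(&1 * 2))
end

IO.inspect(PipelineDemo.first_even_squares(3))   # [4, 16, 36]
IO.inspect(PipelineDemo.double_all([1, 2, 3]))   # [2, 4, 6]
```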

greggreg answered Oct 07 '22 05:10


For short lists, Stream will be slower than simply using Enum, but there's no clear rule there without benchmarking exactly what you are doing. There are also some functions that exist in Enum but have no corresponding function in Stream (for example, Enum.reverse).
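If you want to measure this for your own workload, something along these lines is one way to start. It is only a sketch: the module, the helper names, and the run count are assumptions, and :timer.tc/1 gives crude wall-clock numbers rather than a proper benchmark.

```elixir
defmodule ShortListDemo do
  # The same two-step transformation, written eagerly and lazily.
  def with_enum(list) do
    list |> Enum.map(&(&1 + 1)) |> Enum.filter(&(rem(&1, 2) == 0))
  end

  def with_stream(list) do
    list
    |> Stream.map(&(&1 + 1))
    |> Stream.filter(&(rem(&1, 2) == 0))
    |> Enum.to_list()
  end

  # Calls `fun` `runs` times and returns the total elapsed microseconds.
  def time(fun, runs \\ 10_000) do
    {micros, _result} = :timer.tc(fn -> Enum.each(1..runs, fn _ -> fun.() end) end)
    micros
  end

  # There is no lazy reverse: the last element has to be seen before the
  # first can be emitted, so the reversal happens in Enum at the end.
  def reversed_evens(list) do
    list |> Stream.filter(&(rem(&1, 2) == 0)) |> Enum.reverse()
  end
end

short = [1, 2, 3, 4, 5]
IO.inspect(ShortListDemo.time(fn -> ShortListDemo.with_enum(short) end), label: "enum (us)")
IO.inspect(ShortListDemo.time(fn -> ShortListDemo.with_stream(short) end), label: "stream (us)")
```

The timings vary by machine and by list size, which is the point: measure the pipeline you actually have.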

The real reason you need both is that Stream is just a composition of functions. Every pipeline that needs results, rather than side effects, needs to end in an Enum function to get the pipeline to run.
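For the side-effect case, Stream.run/1 forces a stream and discards the values. Here is a small sketch, using message passing as a stand-in side effect (the module and its helpers are invented for the example, and it assumes the calling process starts with an empty mailbox):

```elixir
defmodule SideEffectDemo do
  # Lazy pipeline whose only purpose is a side effect: sending each
  # element to the process that built the stream.
  def notifier(enumerable) do
    parent = self()
    Stream.each(enumerable, fn x -> send(parent, {:seen, x}) end)
  end

  # Returns the next pending message, or :none if the mailbox is empty.
  def next_message do
    receive do
      msg -> msg
    after
      0 -> :none
    end
  end

  # Shows that building the stream sends nothing; running it does.
  def demo do
    stream = notifier([1, 2])
    before_run = next_message()
    :ok = Stream.run(stream)
    {before_run, next_message(), next_message()}
  end
end

IO.inspect(SideEffectDemo.demo())   # {:none, {:seen, 1}, {:seen, 2}}
```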

They go hand in hand; Stream couldn't stand alone. What Stream is largely doing is giving you a very handy abstraction for creating very complex reduce functions.
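To make the "complex reduce" point concrete, here is a sketch (hypothetical names throughout) of the same single-pass computation written twice: once as a Stream pipeline, once as the hand-rolled Enum.reduce/3 it roughly stands in for.

```elixir
defmodule ReduceDemo do
  # Composed version: each step is named separately, and the data is
  # still traversed only once when Enum.sum/1 consumes the stream.
  def sum_with_stream(enumerable) do
    enumerable
    |> Stream.map(&(&1 * 3))
    |> Stream.filter(&(rem(&1, 2) == 1))
    |> Enum.sum()
  end

  # Hand-written version: the map, filter, and sum steps are fused into
  # a single reducer function.
  def sum_with_reduce(enumerable) do
    Enum.reduce(enumerable, 0, fn x, acc ->
      tripled = x * 3
      if rem(tripled, 2) == 1, do: acc + tripled, else: acc
    end)
  end
end

IO.inspect(ReduceDemo.sum_with_stream(1..5))   # 27
IO.inspect(ReduceDemo.sum_with_reduce(1..5))   # 27
```

The Stream version is easier to extend with another step; the reduce version shows what is being abstracted away.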

Fred the Magic Wonder Dog answered Oct 07 '22 05:10