
Use-cases for Streams in Scala

In Scala there is a Stream class that is very much like an iterator. The topic "Difference between Iterator and Stream in Scala?" offers some insights into the similarities and differences between the two.

Seeing how to use a stream is pretty simple but I don't have very many common use-cases where I would use a stream instead of other artifacts.

The ideas I have right now:

  • If you need to make use of an infinite series. But this does not seem like a common use-case to me so it doesn't fit my criteria. (Please correct me if it is common and I just have a blind spot)
  • If you have a series of data where each element needs to be computed but that you may want to reuse several times. This is weak because I could just load it into a list which is conceptually easier to follow for a large subset of the developer population.
  • Perhaps there is a large set of data or a computationally expensive series, and there is a high probability that finding the items you need will not require visiting all of the elements. But in this case an Iterator would be a good match unless you need to do several searches; in that case you could use a list as well, even if it would be slightly less efficient.
  • There is a complex series of data that needs to be reused. Again, a list could be used here, although in this case both would be equally difficult to use and a Stream would be a better fit since not all elements need to be loaded. But again, not that common... or is it?

So have I missed any big uses? Or is it a developer preference for the most part?

Thanks

Jesse Eichar asked Jan 19 '10 20:01



2 Answers

The main difference between a Stream and an Iterator is that the latter is mutable and "one-shot", so to speak, while the former is not. Iterator has a better memory footprint than Stream, but the fact that it is mutable can be inconvenient.

Take this classic prime number generator, for instance:

    def primeStream(s: Stream[Int]): Stream[Int] =
      Stream.cons(s.head, primeStream(s.tail filter { _ % s.head != 0 }))
    val primes = primeStream(Stream.from(2))

It can easily be written with an Iterator as well, but an Iterator won't keep the primes computed so far.

So, one important aspect of a Stream is that you can pass it to other functions without having it duplicated first, or having to generate it again and again.
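To make the contrast concrete, here is a minimal sketch (not taken from the answer) that walks an Iterator and a Stream twice each and counts how often the element function runs for the Stream. The names `ReuseDemo`, `drain`, `iteratorSums` and `streamSums` are made up for illustration. Note that on Scala 2.13 and later `Stream` still compiles but is deprecated in favour of `LazyList`.

```scala
object ReuseDemo {
  // Consumes an iterator using only hasNext/next and returns the sum.
  def drain(it: Iterator[Int]): Int = {
    var total = 0
    while (it.hasNext) total += it.next()
    total
  }

  // Drains the same Iterator twice: the second pass finds it exhausted.
  def iteratorSums(): (Int, Int) = {
    val squares = Iterator(1, 2, 3, 4, 5).map(n => n * n)
    val first = drain(squares)
    val second = drain(squares)
    (first, second)
  }

  // Sums the same Stream twice and reports how many times the mapping
  // function was invoked in total.
  def streamSums(): (Int, Int, Int) = {
    var evaluations = 0
    val squares = Stream(1, 2, 3, 4, 5).map { n => evaluations += 1; n * n }
    val first = squares.sum   // forces every element
    val second = squares.sum  // reuses the memoized elements
    (first, second, evaluations)
  }

  def main(args: Array[String]): Unit = {
    println(iteratorSums())
    println(streamSums())
  }
}
```

The Iterator yields its elements once and is then empty, while the Stream can be handed to a second consumer without recomputing anything. The price is that the Stream keeps every computed element in memory for as long as its head is referenced.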

As for expensive computations/infinite lists, these things can be done with Iterator as well. Infinite lists are actually quite useful -- you just don't know it because you didn't have it, so you have seen algorithms that are more complex than strictly necessary just to deal with enforced finite sizes.
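A small sketch of the infinite-list point, not part of the answer itself: with an infinite Stream the producer does not have to guess an upper bound, and each consumer chooses its own stopping condition. `InfiniteDemo`, `squaresBelow` and `firstSquareAbove` are hypothetical names, and the example assumes limits small enough that `n * n` does not overflow an `Int`.

```scala
object InfiniteDemo {
  // All square numbers, computed on demand as consumers ask for them.
  // Assumption: callers stay well below Int overflow (n < 46341).
  val squares: Stream[Int] = Stream.from(1).map(n => n * n)

  // The consumer decides where to stop; no size is baked into the producer.
  def squaresBelow(limit: Int): List[Int] =
    squares.takeWhile(_ < limit).toList

  // A different consumer with a different stopping rule, same producer.
  def firstSquareAbove(limit: Int): Option[Int] =
    squares.find(_ > limit)

  def main(args: Array[String]): Unit = {
    println(squaresBelow(50))
    println(firstSquareAbove(50))
  }
}
```

The same could be done with `Iterator.from(1)`, but each consumer would then need a fresh iterator and would recompute the squares from the start.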

Daniel C. Sobral answered Sep 30 '22 03:09


In addition to Daniel's answer, keep in mind that Stream is useful for short-circuiting evaluations. For example, suppose I have a huge set of functions that take String and return Option[String], and I want to keep executing them until one of them works:

    val stringOps = List(
      (s:String) => if (s.length>10) Some(s.length.toString) else None ,
      (s:String) => if (s.length==0) Some("empty") else None ,
      (s:String) => if (s.indexOf(" ")>=0) Some(s.trim) else None
    )

Well, I certainly don't want to execute the entire list, and there isn't any handy method on List that says, "treat these as functions and execute them until one of them returns something other than None". What to do? Perhaps this:

    def transform(input: String, ops: List[String=>Option[String]]) = {
      ops.toStream.map( _(input) ).find(_.isDefined).getOrElse(None)
    }

This takes a list and treats it as a Stream (which doesn't actually evaluate anything), then defines a new Stream that is a result of applying the functions (but that doesn't evaluate anything either yet), then searches for the first one which is defined--and here, magically, it looks back and realizes it has to apply the map, and get the right data from the original list--and then unwraps it from Option[Option[String]] to Option[String] using getOrElse.

Here's an example:

    scala> transform("This is a really long string",stringOps)
    res0: Option[String] = Some(28)

    scala> transform("",stringOps)
    res1: Option[String] = Some(empty)

    scala> transform("  hi ",stringOps)
    res2: Option[String] = Some(hi)

    scala> transform("no-match",stringOps)
    res3: Option[String] = None

But does it work? If we put a println into our functions so we can tell if they're called, we get

    val stringOps = List(
      (s:String) => {println("1"); if (s.length>10) Some(s.length.toString) else None },
      (s:String) => {println("2"); if (s.length==0) Some("empty") else None },
      (s:String) => {println("3"); if (s.indexOf(" ")>=0) Some(s.trim) else None }
    )
    // (transform is the same)

    scala> transform("This is a really long string",stringOps)
    1
    res0: Option[String] = Some(28)

    scala> transform("no-match",stringOps)
    1
    2
    3
    res1: Option[String] = None

(This is with Scala 2.8; 2.7's implementation will sometimes overshoot by one, unfortunately. And note that you do accumulate a long list of None as your failures accrue, but presumably this is inexpensive compared to your true computation here.)
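As a possible variation on `transform` (an assumption on my part, not something the answer proposes), `collectFirst` can pick out the first `Some` and unwrap it in one step, so the intermediate `Option[Option[String]]` never appears. In the sketch below `ShortCircuitDemo`, `Op` and `transformFirst` are hypothetical names, and the call counter replaces the `println` calls purely so the short-circuiting can be checked programmatically.

```scala
object ShortCircuitDemo {
  type Op = String => Option[String]

  // Returns the first defined result together with the number of
  // functions that were actually invoked.
  def transformFirst(input: String, ops: List[Op]): (Option[String], Int) = {
    var calls = 0
    // Lazily apply each function; the counter records every real invocation.
    val results = ops.toStream.map { op => calls += 1; op(input) }
    // Stop at the first Some and unwrap it directly.
    val first = results.collectFirst { case Some(s) => s }
    (first, calls)
  }

  // Same three functions as in the answer, without the printing.
  val stringOps: List[Op] = List(
    (s: String) => if (s.length > 10) Some(s.length.toString) else None,
    (s: String) => if (s.length == 0) Some("empty") else None,
    (s: String) => if (s.indexOf(" ") >= 0) Some(s.trim) else None
  )

  def main(args: Array[String]): Unit = {
    println(transformFirst("This is a really long string", stringOps))
    println(transformFirst("no-match", stringOps))
  }
}
```

When no function matches, all three are called and the result is `None`; when an early one matches, the later ones should be skipped, subject to the same version caveat about overshooting mentioned above.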

Rex Kerr answered Sep 30 '22 05:09