Functional processing of Scala streams without OutOfMemory errors

Is it possible to apply functional programming to Scala streams such that the stream is processed sequentially, but the already processed part of the stream can be garbage collected?

For example, I define a Stream that contains the numbers from start to end:

def fromToStream(start: Int, end: Int) : Stream[Int] = {
  if (end < start) Stream.empty
  else start #:: fromToStream(start+1, end)
}

If I sum up the values in a functional style:

println(fromToStream(1,10000000).reduceLeft(_+_))

I get an OutOfMemoryError - presumably because the stack frame of the reduceLeft call holds a reference to the head of the stream, so the already processed prefix can never be collected. But if I do the same thing in iterative style, it works:

var sum = 0
for (i <- fromToStream(1,10000000)) {
  sum += i
}

Is there a way to do this in a functional style without getting an OutOfMemory?

UPDATE: This was a bug in Scala that has since been fixed, so the question is more or less out of date now.

asked Nov 09 '10 by Hans-Peter Störr

1 Answer

When I started learning about Stream I thought this was cool. Then I realized Iterator is what I want to use nearly all the time.
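As a sketch of that point (fromToIterator is my own name, the Iterator analogue of the question's fromToStream): an Iterator yields elements on demand and does not retain the ones already consumed, so the same functional reduceLeft runs in constant memory.

```scala
// Hypothetical helper: the Iterator counterpart of fromToStream.
// Iterator.range is exclusive at the upper end, hence end + 1.
def fromToIterator(start: Int, end: Int): Iterator[Int] =
  Iterator.range(start, end + 1)

// Consumed elements are not kept anywhere, so this does not OOM.
println(fromToIterator(1, 10000000).reduceLeft(_ + _))
```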

If you do need Stream but want reduceLeft to work:

fromToStream(1,10000000).toIterator.reduceLeft(_ + _)

If you run the line above, the processed prefix is garbage collected just fine. I have found that using Stream is tricky because it is easy to hold on to the head without realizing it. Sometimes the standard library will hold on to it for you - in very subtle ways.
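One such subtlety, sketched with small numbers (the val/def names are mine; fromToStream is repeated from the question so the snippet is self-contained): binding the stream to a val pins its head for the lifetime of the val, while going through a def does not.

```scala
def fromToStream(start: Int, end: Int): Stream[Int] =
  if (end < start) Stream.empty
  else start #:: fromToStream(start + 1, end)

// The val keeps a reference to the head, so every cell forced during
// the fold stays reachable as long as `pinned` is in scope. For a
// large enough range this is what blows the heap.
val pinned = fromToStream(1, 1000)
val a = pinned.foldLeft(0)(_ + _)

// A def rebuilds the stream on each use; nothing outside the fold
// holds the head, so cells behind the cursor can be collected.
def fresh = fromToStream(1, 1000)
val b = fresh.foldLeft(0)(_ + _)
```

Both folds compute the same sum; the difference is only in what stays reachable while they run.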

answered Oct 20 '22 by huynhjl