
Why does a Stream fold operation throw an OutOfMemoryError?

I have the following simple code:

 def fib(i:Long,j:Long):Stream[Long] = i #:: fib(j, i+j)
 (0l /: fib(1,1).take(10000000)) (_+_)

It throws an OutOfMemoryError. I cannot understand why, because I think all the parts use constant memory, i.e. lazily evaluated streams and foldLeft...

This code does not work either:

fib(1,1).take(10000000).sum, or max, min, etc.

How do I correctly implement infinite streams and perform iterative operations on them?

Scala version: 2.9.0

Also, the Scaladoc says that the foldLeft operation is memory-safe for streams:

  /** Stream specialization of foldLeft which allows GC to collect
   *  along the way.
   */
  @tailrec
  override final def foldLeft[B](z: B)(op: (B, A) => B): B = {
    if (this.isEmpty) z
    else tail.foldLeft(op(z, head))(op)
  }

EDIT:

An implementation with iterators is still not usable, since it throws a StackOverflowError:

   def fib(i:Long,j:Long): Iterator[Long] = Iterator(i) ++  fib(j, i + j)

How do I correctly define an infinite stream/iterator in Scala?

EDIT2: I don't care about integer overflow; I just want to understand how to create an infinite stream/iterator etc. in Scala without side effects.

yura asked Sep 05 '11 09:09



2 Answers

The reason to use Stream instead of Iterator is so that you don't have to calculate all the small terms in the series over again. But this means that you need to store ten million stream nodes. These are pretty large, unfortunately, so that could be enough to overflow the default memory. The only realistic way to overcome this is to start with more memory (e.g. scala -J-Xmx2G). (Also, note that you're going to overflow Long by an enormous margin; the Fibonacci series increases pretty quickly.)

P.S. The iterator implementation I have in mind is completely different; you don't build it out of concatenated singleton Iterators:

def fib(i: Long, j: Long) = Iterator.iterate((i,j)){ case (a,b) => (b,a+b) }.map(_._1)

Now when you fold, past results can be discarded.
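
As a minimal sketch of that idea (the object name FibIteratorSketch and the helper sumOfFirst are made up for illustration, and BigInt is swapped in for Long only to sidestep the overflow mentioned above), a fold over such an iterator could look like this:

```scala
object FibIteratorSketch {
  // Each step holds only the current pair (a, b); earlier pairs are
  // unreachable once the iterator has moved past them.
  def fib(i: BigInt, j: BigInt): Iterator[BigInt] =
    Iterator.iterate((i, j)) { case (a, b) => (b, a + b) }.map(_._1)

  // Sum of the first n terms, computed in a single pass over the iterator.
  def sumOfFirst(n: Int): BigInt =
    fib(1, 1).take(n).foldLeft(BigInt(0))(_ + _)
}
```

Because nothing keeps a reference to the terms already consumed, memory use should depend on the size of the numbers themselves rather than on how many terms are folded.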

Rex Kerr answered Oct 03 '22 11:10



The OutOfMemoryError happens independently of the fact that you use Stream. As Rex Kerr mentioned above, Stream, unlike Iterator, stores everything in memory. The difference from List is that the elements of a Stream are calculated lazily, but once you reach 10000000, there will be 10000000 elements, just as with a List.

Try new Array[Int](10000000); you will have the same problem.

To calculate the Fibonacci numbers as above, you may want to use a different approach. You only need to keep the last two numbers, rather than all the Fibonacci numbers computed so far.

For example:

scala> def fib(i:Long,j:Long): Iterator[Long] = Iterator(i) ++  fib(j, i + j)
fib: (i: Long,j: Long)Iterator[Long]

And to get, for example, the index of the first Fibonacci number exceeding 1000000:

scala> fib(1, 1).indexWhere(_ > 1000000)
res12: Int = 30
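
The "only two numbers" observation can also be sketched without any iterator at all. The following is an illustrative tail-recursive loop, not part of the original answer; the names FibLoopSketch, nth and loop are hypothetical, and BigInt is assumed so that large indices do not overflow:

```scala
import scala.annotation.tailrec

object FibLoopSketch {
  // Returns the n-th term (0-based) of the sequence 1, 1, 2, 3, 5, ...
  def nth(n: Int): BigInt = {
    // Only the two most recent terms (a, b) are carried between steps.
    @tailrec
    def loop(k: Int, a: BigInt, b: BigInt): BigInt =
      if (k == 0) a else loop(k - 1, b, a + b)
    loop(n, BigInt(1), BigInt(1))
  }
}
```

With this sketch, FibLoopSketch.nth(30) gives 1346269, the first term above 1000000, which matches the index 30 found above. The @tailrec annotation asks the compiler to reject the method if the recursion cannot be turned into a loop, so the stack depth stays constant.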

Edit: I added the following lines to cope with the StackOverflowError.

If you really want to work with the millionth Fibonacci number, the iterator definition above will not work either, because it fails with a StackOverflowError. The following is the best I have in mind at the moment:

class FibIterator extends Iterator[BigDecimal] {
  var i: BigDecimal = 1
  var j: BigDecimal = 1
  def next = {
    val temp = i
    i = i + j
    j = temp
    j
  }
  def hasNext = true
}

scala> new FibIterator().take(1000000).foldLeft(0:BigDecimal)(_ + _)
res49: BigDecimal = 82742358764415552005488531917024390424162251704439978804028473661823057748584031
0652444660067860068576582339667553466723534958196114093963106431270812950808725232290398073106383520
9370070837993419439389400053162345760603732435980206131237515815087375786729469542122086546698588361
1918333940290120089979292470743729680266332315132001038214604422938050077278662240891771323175496710
6543809955073045938575199742538064756142664237279428808177636434609546136862690895665103636058513818
5599492335097606599062280930533577747023889877591518250849190138449610994983754112730003192861138966
1418736269315695488126272680440194742866966916767696600932919528743675517065891097024715258730309025
7920682881137637647091134870921415447854373518256370737719553266719856028732647721347048627996967...
anrizal - Anwar Rizal answered Oct 03 '22 11:10
