Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Reducing a large stream without stack overflowing

1I'm trying to make a limit-less factorial function (just out of curiosity.) This works for large n (tried up to 100000 and it seems to work, although I can't check the output value for correctness because, well, it's LARGE!)

(BigInt(1) to n).reduceRight(_*_)

But I'm afraid that the whole BigInt(1) to n range might be in memory, while I just need it element by element for reduceRight. Taking a look at Scala's standard library code it looks like BigInt(1) to n actually outputs the whole Range and not a lazy Stream but I found Stream.range which I can use like this (notice the n+1, stream range is exclusive)

Stream.range[BigInt](BigInt(1), BigInt(n+1)).reduceRight(_*_)

It works for n=10000 (for some reason it takes a little bit longer! which makes me think that maybe the normal range is actually a Stream too?) but for n=100000 I get this stack overflow

java.lang.StackOverflowError
    at java.math.BigInteger.add(Unknown Source)
    at scala.math.BigInt.$plus(BigInt.scala:161)
    at scala.math.Numeric$BigIntIsIntegral$class.plus(Numeric.scala:28)
    at scala.math.Numeric$BigIntIsIntegral$.plus(Numeric.scala:40)
    at scala.math.Numeric$BigIntIsIntegral$.plus(Numeric.scala:40)
    at scala.math.Numeric$Ops.$plus(Numeric.scala:208)
    at scala.collection.immutable.Stream$$anonfun$range$1.apply(Stream.scala:695)
    at scala.collection.immutable.Stream$$anonfun$range$1.apply(Stream.scala:695)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:634)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:626)
    at scala.collection.LinearSeqOptimized$class.reduceRight(LinearSeqOptimized.scala:130)
    at scala.collection.immutable.Stream.reduceRight(Stream.scala:47)
    at scala.collection.LinearSeqOptimized$class.reduceRight(LinearSeqOptimized.scala:131)
    at scala.collection.immutable.Stream.reduceRight(Stream.scala:47)
    at scala.collection.LinearSeqOptimized$class.reduceRight(LinearSeqOptimized.scala:131)
    at scala.collection.immutable.Stream.reduceRight(Stream.scala:47)
    ...

It's obvious that reduceRight is calling itself like this

reduceRight(reduceRight(first, second, op), third, op)...

And thus the stack overflow. I assume it's not tail-call optimized because it first reduces and then operates prior to returning the value, so it can't be optimized. How could I approach this problem then? Should I ditch the functional approach and aim for custom imperative-style code for reducing?

What strikes me as a very odd thing is that the (BigInt(1) to n).reduceRight(_*_) does not overflow for large n while almost the same using a stream does... What's going on here?

like image 493
kaoD Avatar asked Apr 19 '12 14:04

kaoD


4 Answers

You are correct that your first solution will create a list in memory to store the reversed sequence. You could simply use reduceLeft (which doesn't have that problem on ranges), but that will pass through the range in the opposite direction. If for some reason you want to start at the large end but keep the laziness of reduceLeft, you can create a backwards Range:

def fact(n: Int) = (BigInt(n) to 1 by -1).reduceLeft(_ * _)

There may be other ways you can easily optimize this function.

like image 106
Travis Brown Avatar answered Nov 06 '22 03:11

Travis Brown


reduceLeft is implemented to compute as it goes on streams (and it calls in order). You can verify as follows:

Stream.range(1,10).map(i => print(i)).reduceLeft((a,b) => println("r"))

Or you can use tail recursion:

final def fac(n: BigInt, prod: BigInt = BigInt(1)): BigInt = {
  if (n<2) prod else fac(n-1,prod*n)
}

(as Travis points out, it'd be faster to multiply the small numbers first, which would take an extra line).

like image 35
Rex Kerr Avatar answered Nov 06 '22 03:11

Rex Kerr


Try using reduceLeft instead. reduceRight starts from the end of the stream and thus forces every element to be evaluated — exactly what you wanted to avoid.

like image 4
Jean-Philippe Pellet Avatar answered Nov 06 '22 04:11

Jean-Philippe Pellet


def fac (n: BigInt=1, m:BigInt=1): Stream[BigInt] = 
  Stream.cons (n, fac (n * m, m+1))

fac().take (10).toList
res27: List[BigInt] = List(1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880)

works well with take(10000).toList too.

like image 1
user unknown Avatar answered Nov 06 '22 03:11

user unknown