I have a very large List of numbers, which undergo lots of math manipulation. I only care about the final result. To simulate this behavior, see my example code below:
object X {
  def main(args: Array[String]) = {
    val N = 10000000
    val x = List(1 to N).flatten
    println(x.slice(0, 10))
    Thread.sleep(5000)
    val y = x.map(_ * 5)
    println(y.slice(0, 10))
    Thread.sleep(5000)
    val z = y.map(_ + 4)
    println(z.slice(0, 10))
    Thread.sleep(5000)
  }
}
So x is a very large list, and I care only about the result z. To obtain z, I first have to manipulate x mathematically to get y, and then manipulate y to get z. (I cannot go from x to z in one step, because the real manipulations are quite complicated. This is just an example.)
So when I run this example, I run out of memory presumably because x, y and z are all in scope and they all occupy memory.
So I try the following:
def main(args: Array[String]) = {
  val N = 10000000
  val z = {
    val y = {
      val x = List(1 to N).flatten
      println(x.slice(0, 10))
      Thread.sleep(5000)
      x
    }.map(_ * 5)
    println(y.slice(0, 10))
    Thread.sleep(5000)
    y
  }.map(_ + 4)
  println(z.slice(0, 10))
  Thread.sleep(5000)
}
So now only z is in scope. So presumably x and y are created and then garbage collected when they go out of scope. But this isn't what happens. Instead, I again run out of memory!
(Note: I am using java -Xincgc, but it doesn't help.)
Question: When I have adequate memory for only one large list, can I somehow manipulate it using only vals (i.e. no mutable vars or ListBuffers), maybe using scoping to force GC? If so, how? Thanks.
Have you tried something like this?
val N = 10000000
val x = List(1 to N).flatten.view // get a view
val y = x.map(_ * 5)
val z = y.map(_ + 4)
println(z.force.slice(0, 10))
It should help avoid creating the full intermediate structures for y and z.
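As a minimal sketch of the same idea, the view can also be taken directly over the range, so that not even x is built as a List, and only the elements actually requested are computed. The object name ViewPipeline and the helper firstResults are hypothetical names used for illustration, and starting from the range instead of a List is an assumption about what the real data source allows:

```scala
object ViewPipeline {
  // Hypothetical helper: the first `count` results of the two-step pipeline over 1..n.
  def firstResults(n: Int, count: Int): List[Int] =
    (1 to n).view   // lazy view over the range; no List is built here
      .map(_ * 5)   // the "y" step, deferred
      .map(_ + 4)   // the "z" step, deferred
      .take(count)  // still lazy
      .toList       // materializes only the requested elements

  def main(args: Array[String]): Unit =
    println(firstResults(10000000, 10)) // prints List(9, 14, 19, 24, 29, 34, 39, 44, 49, 54)
}
```

If the whole of z is needed, dropping the take and calling toList on the mapped view should build just that one final list.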
Look at using view. It wraps a collection lazily and calculates each value only when it is required, so it doesn't form an intermediate collection:
scala> (1 to 5000000).map(i => {i*i}).map(i=> {i*2}) .toList
java.lang.OutOfMemoryError: Java heap space
at java.lang.Integer.valueOf(Integer.java:625)
at scala.runtime.BoxesRunTime.boxToInteger(Unknown Source)
at scala.collection.immutable.Range.foreach(Range.scala:75)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:194)
at scala.collection.immutable.Range.map(Range.scala:43)
at .<init>(<console>:8)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:704)
at scala.tools.nsc.interpreter.IMain$Request$$anonfun$14.apply(IMain.scala:920)
at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
at java.lang.Thread.run(Thread.java:662)
scala> (1 to 5000000).view.map(i => {i*i}).view.map(i=> {i*2}) .toList
res10: List[Int] = List(2, 8, 18, 32, 50, 72, ...
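One way to see why the view version needs less memory is to record the order in which the two map steps run. The sketch below is illustrative only: EvaluationOrder, trace, first and second are hypothetical names, and a three-element range stands in for the large one. Without the view, the first map finishes over every element (building a full intermediate collection) before the second starts; with the view, each element passes through both steps before the next one is touched.

```scala
import scala.collection.mutable.ArrayBuffer

object EvaluationOrder {
  // Runs two map steps over 1..3 and records the order in which they fire.
  def trace(useView: Boolean): List[String] = {
    val log = ArrayBuffer.empty[String]
    val first: Int => Int = i => { log += s"first($i)"; i * 5 }
    val second: Int => Int = i => { log += s"second($i)"; i + 4 }
    // The mapped result is discarded; only the recorded order matters here.
    if (useView) (1 to 3).view.map(first).map(second).toList
    else (1 to 3).map(first).map(second).toList
    log.toList
  }

  def main(args: Array[String]): Unit = {
    // Strict: first(1), first(2), first(3), second(5), second(10), second(15)
    println(trace(useView = false))
    // View: first(1), second(5), first(2), second(10), first(3), second(15)
    println(trace(useView = true))
  }
}
```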