This takes around 1 second
(1 to 1000000).map(_+3)
While this gives java.lang.OutOfMemoryError: Java heap space
(1 to 1000000).par.map(_+3)
EDIT:
I have the standard Scala 2.9.2 configuration and I am typing this at the Scala prompt. In the bash script I can see [ -n "$JAVA_OPTS" ] || JAVA_OPTS="-Xmx256M -Xms32M"
and I don't have JAVA_OPTS set in my environment.
1 million integers = 8MB, creating list twice = 16MB
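To confirm which heap limit the session is really running with, the JVM can be asked directly. The following is a minimal sketch, not part of the original question: the object name HeapInfo and its methods are made up for illustration, and only the standard java.lang.Runtime calls are assumed.

```scala
// Hypothetical helper (name chosen for illustration) that reports JVM heap limits.
object HeapInfo {
  private val MiB = 1024L * 1024L

  // Maximum heap the JVM will attempt to use; this roughly reflects -Xmx.
  def maxHeapMiB: Long = Runtime.getRuntime.maxMemory / MiB

  // Heap currently occupied: memory claimed by the JVM minus the free part of it.
  def usedHeapMiB: Long = {
    val rt = Runtime.getRuntime
    (rt.totalMemory - rt.freeMemory) / MiB
  }

  // Print both figures; call this before and after evaluating the expression.
  def report(): Unit = {
    println("max heap (MiB):  " + maxHeapMiB)
    println("used heap (MiB): " + usedHeapMiB)
  }
}
```

Calling HeapInfo.report() at the prompt before and after each evaluation should show whether the limit is close to the 256M default from the script and how much of it is already occupied.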
It definitely seems related to the JVM memory options and to the memory required to store a parallel collection. For example:
scala> (1 to 1000000).par.map(_+3)
ended up with an OutOfMemoryError the third time I tried to evaluate it, while
scala> (1 to 1000000).par.map(_+3).seq
never failed. The issue is not the computation; it is the storage of the parallel collection.
Several reasons for the failure:
map means that the range is converted into a vector. For parallel vectors, an efficient concatenation has not been implemented yet, so merging the intermediate vectors produced by different processors proceeds by copying, which requires more memory. This will be addressed in future releases.
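The effect of merging by copying can be illustrated with a small sequential sketch. This is not the library's actual combiner code: the object CopyMerge and its methods are hypothetical, and the chunks are mapped one after another rather than on separate processors. It only shows that the mapped chunks and the merged result are alive at the same time.

```scala
// Hypothetical illustration of "map per chunk, then merge by copying".
object CopyMerge {
  // Map each chunk on its own, as independent workers would, producing
  // one intermediate Vector per chunk.
  def mapChunks(range: Range, chunkSize: Int)(f: Int => Int): List[Vector[Int]] =
    range.grouped(chunkSize).map(chunk => Vector.empty[Int] ++ chunk.map(f)).toList

  // Merge by concatenation: every ++ builds a new Vector, while the
  // intermediate chunks stay reachable through `parts` until merging ends.
  def mergeByCopy(parts: List[Vector[Int]]): Vector[Int] =
    parts.foldLeft(Vector.empty[Int])(_ ++ _)
}
```

With a call such as CopyMerge.mergeByCopy(CopyMerge.mapChunks(1 to 1000000, 250000)(_ + 3)), the heap has to hold the four intermediate vectors in addition to the final one, which is the extra memory the answer refers to.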