Using Scala parallelism when iterating over a Java List converted to an immutable one

I'm attempting to reduce execution time using Scala parallelism.

To convert a Java ArrayList to an immutable one, I use:

val imList = scala.collection.JavaConversions.asScalaBuffer(normalQLFolderList)

Then, to take advantage of multiple cores when iterating, I use:

for (i <- imList.par) {
}

Am I taking advantage of Scala parallelism in the correct way, in this case when iterating over a list? Is there a large performance hit from asScalaBuffer?

blue-sky asked Feb 22 '13 11:02



1 Answer

Collections which can be converted into their parallel counterparts in constant time include mutable and immutable hash maps and hash sets, ranges, vectors and arrays. For all other collection types, including wrappers around collections coming from Java, calling par results in copying the contents of the collection into a format more suitable for parallelization.

This is described here in more detail:

http://docs.scala-lang.org/overviews/parallel-collections/conversions.html
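To make the cost described above concrete, here is a minimal sketch of wrapping a Java list and then calling par on it. It assumes Scala 2.12 or earlier, where parallel collections ship with the standard library; on Scala 2.13 and later, par requires the separate scala-parallel-collections module and an extra import of scala.collection.parallel.CollectionConverters._. The names ParConversionSketch, process and totalLength are hypothetical and only stand in for the work done in the question's for block.

```scala
import java.util.{ArrayList => JArrayList}
import scala.collection.JavaConverters._

object ParConversionSketch {
  // Hypothetical stand-in for the per-element work in the question's for block.
  def process(s: String): Int = s.length

  def totalLength(javaList: java.util.List[String]): Int = {
    // asScala only wraps the Java list; no elements are copied at this point.
    val wrapped = javaList.asScala
    // par on such a wrapper is not constant time: the elements are copied
    // into a collection that is better suited to parallel processing.
    val parallel = wrapped.par
    // The map runs in parallel; sum combines the partial results.
    parallel.map(process).sum
  }

  def main(args: Array[String]): Unit = {
    val folders = new JArrayList[String]()
    folders.add("alpha")
    folders.add("beta")
    folders.add("gamma")
    println(totalLength(folders))
  }
}
```

The wrapping step itself is cheap; the copy happens at the par call, so that is the part to weigh against the work done per element.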

However, depending on how big the collection is and how expensive the for block is, it might be perfectly reasonable to pay for this conversion. The more processing the parallel for block does per element, the more the cost of the conversion is amortized.

I would say that if the computation per element involves anything nontrivial (e.g. it at least creates new objects), paying for the conversion makes sense. Still, it is a good idea to measure the performance difference between the sequential version and the parallel version, with the call to par included in the parallel measurement:

http://docs.scala-lang.org/overviews/parallel-collections/performance.html
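One rough way to run such a comparison is sketched below. It is only a crude timing harness under stated assumptions: Scala 2.12 or earlier (so that par is available without an extra module), a hypothetical work function standing in for the real per-element computation, and a simple warm-up loop in place of a proper benchmarking tool. The names ParTimingSketch, work, timed, sequentialSum and parallelSum are illustrative and not part of any library.

```scala
import scala.collection.JavaConverters._

object ParTimingSketch {
  // Hypothetical per-element work; replace with the real computation.
  def work(n: Int): Long = {
    var acc = 0L
    var i = 0
    while (i < 1000) { acc += (n.toLong * i) % 7; i += 1 }
    acc
  }

  // Runs body once and returns its result with the elapsed nanoseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, System.nanoTime() - start)
  }

  // Sequential version over the wrapped Java list.
  def sequentialSum(xs: java.util.List[Integer]): Long =
    xs.asScala.map(x => work(x.intValue)).sum

  // Parallel version; the call to par (and its copy) is inside the measured code.
  def parallelSum(xs: java.util.List[Integer]): Long =
    xs.asScala.par.map(x => work(x.intValue)).sum

  def main(args: Array[String]): Unit = {
    val data = new java.util.ArrayList[Integer]()
    (1 to 100000).foreach(i => data.add(Integer.valueOf(i)))

    // Crude warm-up so both paths have been exercised before timing.
    (1 to 5).foreach { _ => sequentialSum(data); parallelSum(data) }

    val (seqResult, seqNanos) = timed(sequentialSum(data))
    val (parResult, parNanos) = timed(parallelSum(data))
    println(s"sequential: ${seqNanos / 1000000} ms")
    println(s"parallel (including par): ${parNanos / 1000000} ms")
    println(s"results agree: ${seqResult == parResult}")
  }
}
```

Single-run timings like these are noisy, so treat the numbers as a first indication only and repeat the measurement several times with realistic data sizes.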

axel22 answered Sep 28 '22 15:09
