Which operations on Scala parallel collections are parallelized?

Tags:

collections

parallel-processing

scala

It seems that when I invoke map on a parallel list, the operation runs in parallel, but when I call filter on that list, it runs strictly sequentially. So to make filter parallel, I first map each element to a tuple (A, Boolean), then filter those tuples, and then map everything back again. This is not very convenient.

So I am interested: which operations on parallel collections are parallelized and which are not?

746

asked Oct 05 '11 15:10

Rogach

1 Answer

There are no parallel lists. Calling par on a List converts it into the default parallel immutable sequence, a ParVector. This conversion proceeds sequentially. After that, both filter and map should run in parallel.

scala> import scala.collection._
import scala.collection._

scala> List(1, 2, 3).par.filter { x => println(Thread.currentThread); x > 0 }
Thread[ForkJoinPool-1-worker-5,5,main]
Thread[ForkJoinPool-1-worker-3,5,main]
Thread[ForkJoinPool-1-worker-0,5,main]
res0: scala.collection.parallel.immutable.ParSeq[Int] = ParVector(1, 2, 3)

Perhaps you concluded that filter is not parallel because you measured the conversion time together with the filter time.
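One way to check this is to time the conversion and the filter separately. The following is only a minimal sketch, not a proper benchmark: the object name ParTiming and the helper timed are made up for illustration, and the code assumes Scala 2.12 or earlier, where par is part of the standard library. On Scala 2.13 and later, par lives in the separate scala-parallel-collections module and needs import scala.collection.parallel.CollectionConverters._.

```scala
// Assumption: Scala 2.12 or earlier, where .par ships with the standard library.
// On 2.13+, add the scala-parallel-collections module and
// import scala.collection.parallel.CollectionConverters._
object ParTiming {
  // Hypothetical helper: evaluates the block once and returns its result
  // together with the elapsed wall-clock time in nanoseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, System.nanoTime() - start)
  }

  def main(args: Array[String]): Unit = {
    val list = List.range(0, 1000000)

    // Step 1: the List -> ParVector conversion, measured on its own.
    val (parSeq, convertNanos) = timed(list.par)

    // Step 2: the filter alone, on the already-converted collection.
    val (evens, filterNanos) = timed(parSeq.filter(_ % 2 == 0))

    println(s"conversion: ${convertNanos / 1000} us")
    println(s"filter:     ${filterNanos / 1000} us")
    println(s"kept ${evens.size} elements")
  }
}
```

A single run like this is noisy because of JIT warm-up and garbage collection, so repeat the measurement several times before drawing conclusions. If the conversion dominates, starting from a Vector or an Array instead of a List may be worth trying.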

Some operations are currently not parallelized: the sort* variants and indexOfSlice.

103

answered Oct 04 '22 19:10

axel22
