I've been looking at the new Scala 2.9 parallel collections and am hoping to abandon a whole lot of my crufty amateur versions of similar things. In particular, I'd like to replace the fork join pool which underlies the default implementation with something of my own (for example, something that distributes evaluation of tasks across a network, via actors). My understanding is that this is simply a matter of applying Scala's paradigm of "stackable modifications", but the collections library is intimidating enough that I'm not exactly sure which bits need modifying!
Some concrete questions:
- Besides `ForkJoinTasks`, there is also `FutureThreadPoolTasks`. How would I build a collection which uses this trait instead of `ForkJoinTasks`?
- Can I write my own trait (extending, say, `AdaptiveWorkStealingTasks`) and somehow instantiate collection instances that use this new trait?

(For reference, all of the traits mentioned above are defined in Tasks.scala.)
Code examples are especially welcome!
`ForkJoinPool` is an implementation of `ExecutorService` that manages worker threads and provides tools for inspecting the state and performance of the thread pool. Each worker thread executes only one task at a time, but the `ForkJoinPool` does not create a separate thread for every single subtask.
The design of Scala's parallel collections library is inspired by and deeply integrated with Scala's sequential collections library (introduced in 2.8). It provides parallel counterparts to a number of important data structures from the sequential library, including `ParArray`, `ParVector`, and others in the `mutable` package.
Just to provide some more information on how things fit together (which I suspect you already know): the fork-join pool is "plugged in" via the `parallel` package object's `tasksupport` value, which implements the `scala.collection.parallel.TaskSupport` trait.
This, in turn, inherits from `Tasks` (which you mention) and defines such operations as:
def execute[R, Tp](fjtask: Task[R, Tp]): () => R
def executeAndWaitResult[R, Tp](task: Task[R, Tp]): R
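To make the shape of these two operations concrete, here is a minimal sketch of a `TaskSupport` that simply delegates to another one and counts the top-level operations it is handed. The names `CountingTaskSupport`, `submitted` and `CountingDemo` are hypothetical, and the sketch makes two assumptions: that `environment` and `parallelismLevel` are the only other abstract members to implement, and that you are on a Scala version where `tasksupport` can be assigned per collection (2.10 or later; the pool class shown is the one used from 2.12 onwards). It is not how the library's own implementations are written.

```scala
import java.util.concurrent.ForkJoinPool
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.parallel.{ForkJoinTaskSupport, Task, TaskSupport}
import scala.collection.parallel.mutable.ParArray

// Hypothetical wrapper: forwards every call to `underlying` and counts
// how many top-level tasks the collection handed to it.
class CountingTaskSupport(underlying: TaskSupport) extends TaskSupport {
  val submitted = new AtomicInteger(0)

  // Reuse the wrapped task support's environment and parallelism level.
  val environment: AnyRef = underlying.environment
  def parallelismLevel: Int = underlying.parallelismLevel

  def execute[R, Tp](fjtask: Task[R, Tp]): () => R = {
    submitted.incrementAndGet()
    underlying.execute(fjtask)
  }

  def executeAndWaitResult[R, Tp](task: Task[R, Tp]): R = {
    submitted.incrementAndGet()
    underlying.executeAndWaitResult(task)
  }
}

object CountingDemo {
  // Returns (sum of 1..100, number of top-level tasks seen by the wrapper).
  def run(): (Int, Int) = {
    val pool = new ForkJoinPool(2)
    try {
      val ts = new CountingTaskSupport(new ForkJoinTaskSupport(pool))
      val xs = ParArray(1 to 100: _*)
      xs.tasksupport = ts // per-collection setter, assumed available (2.10+)
      val total = xs.sum
      (total, ts.submitted.get)
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = println(run())
}
```

A wrapper like this only sees the operations the collection submits; any further splitting happens inside the wrapped task support. Replacing the scheduler entirely (for example with something actor-based) would mean implementing `execute` and `executeAndWaitResult` yourself rather than delegating.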
However, it's not immediately obvious to me how you can override this behaviour by supplying your own `TaskSupport` implementation, because the collections themselves import it explicitly. For example, in `ParSeqLike`, line 47:
import tasksupport._
In fact, I would go so far as to say that the parallelism looks definitively not overridable (unless I am very much mistaken, though I often am).
Here is a document describing how to switch `TaskSupport` objects in Scala 2.10.
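As a rough illustration of what switching looks like, the sketch below assigns a `ForkJoinTaskSupport` backed by a custom pool to a single collection. It assumes Scala 2.12, or 2.13 with the separate scala-parallel-collections module on the classpath; on 2.10 and 2.11 the pool class would be `scala.concurrent.forkjoin.ForkJoinPool` instead of `java.util.concurrent.ForkJoinPool`. The names `SwitchTaskSupport` and `sumOfSquares` are made up for the example.

```scala
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.ForkJoinTaskSupport
import scala.collection.parallel.mutable.ParArray

object SwitchTaskSupport {
  // Sums the squares of 1..100 on a fork/join pool of the given size.
  def sumOfSquares(parallelism: Int): Int = {
    val pool = new ForkJoinPool(parallelism)
    try {
      val xs = ParArray(1 to 100: _*)
      // Operations invoked on `xs` are now scheduled on `pool`
      // rather than on the default task support.
      xs.tasksupport = new ForkJoinTaskSupport(pool)
      xs.map(x => x * x).sum
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit =
    println(sumOfSquares(2))
}
```

The task support is set per collection instance here, so other parallel collections in the same program keep using the default one.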