
Is this Scala parallel array code thread-safe?

I want to use parallel arrays for a task, and before I start coding, I'd like to know whether this small snippet is thread-safe:

import collection.mutable._

var listBuffer = ListBuffer[String]("one","two","three","four","five","six","seven","eight","nine")
var jSyncList  = java.util.Collections.synchronizedList(new java.util.ArrayList[String]())
listBuffer.par.foreach { e =>
    println("processed :"+e)
    // using sleep here to simulate a random delay
    Thread.sleep((scala.math.random * 1000).toLong)
    jSyncList.add(e)
}
jSyncList.toArray.foreach(println)

Are there better ways of processing something with parallel collections and accumulating the results elsewhere?

Geo asked May 07 '11 11:05


2 Answers

The code you posted is perfectly safe. I'm not sure about the premise, though: why do you need to accumulate the results of a parallel collection in a non-parallel one? One of the main points of the parallel collections is that they look like other collections.

I think parallel collections also provide a seq method to switch back to a sequential collection, so you should probably use that!
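A minimal sketch of that idea: let the parallel collection build the result with map, then call seq to get a sequential collection back. This assumes Scala 2.12 or earlier, where .par is part of the standard library; on 2.13+ it lives in the separate scala-parallel-collections module and needs import scala.collection.parallel.CollectionConverters._. The names ParMapExample and process are hypothetical stand-ins, not from the question.

```scala
// Assumes Scala 2.12 or earlier, where .par is built into the standard library.
// On 2.13+, add the scala-parallel-collections module and
// import scala.collection.parallel.CollectionConverters._
object ParMapExample {
  // Hypothetical stand-in for the real per-element work.
  def process(e: String): String = e.toUpperCase

  def run(input: Vector[String]): Seq[String] =
    // map runs in parallel and assembles the result collection itself,
    // so no shared synchronized list is needed; seq converts the
    // parallel result back to a sequential collection.
    input.par.map(process).seq
}
```

Because map on a parallel sequence keeps the elements in their original positions, the result order matches the input even though the work itself runs out of order.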

oxbow_lakes answered Nov 15 '22 07:11


For this pattern to be safe:

listBuffer.par.foreach { e => f(e) }

f has to be able to run concurrently in a safe way. I think the same rules that you need for safe multi-threading apply: access to shared state needs to be thread-safe, the order of the f calls for different e won't be deterministic, and you may run into deadlocks as you start synchronizing your statements in f.
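As a rough illustration of those rules, here is a sketch of an f that only touches shared state through thread-safe classes from java.util.concurrent. It assumes Scala 2.12 or earlier for the built-in .par (see the module and import needed on 2.13+ in the code comment); SharedStateExample and run are hypothetical names.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import java.util.concurrent.atomic.AtomicInteger

// Assumes Scala 2.12 or earlier; on 2.13+ add scala-parallel-collections and
// import scala.collection.parallel.CollectionConverters._
object SharedStateExample {
  def run(input: Vector[String]): (Int, Set[String]) = {
    // Both pieces of shared state are safe to update from several threads.
    val processed = new AtomicInteger(0)
    val results = new ConcurrentLinkedQueue[String]()

    input.par.foreach { e =>
      processed.incrementAndGet() // atomic; a plain var counter could lose updates
      results.add(e)              // lock-free, thread-safe append
    }

    // Arrival order in the queue is not deterministic, so expose it as a Set.
    val collected = results.toArray(new Array[String](0)).toSet
    (processed.get, collected)
  }
}
```

The count and the set of collected elements are stable from run to run; the order in which elements reach the queue is not.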

Additionally, I'm not clear what guarantees the parallel collections give you about the underlying collection being modified while it is being processed, so a mutable list buffer, which can have elements added or removed, is possibly a poor choice. You never know when the next coder will call something like foo(listBuffer) before your foreach and pass that reference to another thread, which may mutate the list while it's being processed.
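One way to sidestep that concern, sketched under the same assumption of Scala 2.12 or earlier for the built-in .par, is to copy the buffer into an immutable collection before going parallel. SnapshotExample and lengths are hypothetical names used only for illustration.

```scala
import scala.collection.mutable.ListBuffer

// Assumes Scala 2.12 or earlier; on 2.13+ add scala-parallel-collections and
// import scala.collection.parallel.CollectionConverters._
object SnapshotExample {
  def lengths(buffer: ListBuffer[String]): Vector[Int] = {
    // toVector copies the current elements into an immutable Vector, so
    // later changes to the buffer cannot affect the parallel traversal.
    val snapshot: Vector[String] = buffer.toVector
    snapshot.par.map(_.length).seq.toVector
  }
}
```

The copy itself still has to happen while no other thread is mutating the buffer; the snapshot only protects the parallel processing that follows.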

Other than that, I think this is a fine pattern for any f that takes a long time, can be called concurrently, and where the elements can be processed out of order.

immutCol.par.foreach { e => threadSafeOutOfOrderProcessingOf(e) }

Disclaimer: I have not tried parallel collections myself, but I'm looking forward to SO questions and answers showing us what works well.

huynhjl answered Nov 15 '22 07:11