
Is it a good idea to run `...par.map(` on large lists directly?

Let's say I have a fairly large list of strings (several million items or so). Is it a good idea to run something like this:

val updatedList = myList.par.map(someAction).toList

Or would it be a better idea to group the list before running ...par.map(, like this:

val numberOfCores = Runtime.getRuntime.availableProcessors
val updatedList = 
  myList.grouped(numberOfCores).toList.par.map(_.map(someAction)).toList.flatten

UPDATE: Assume that someAction is quite expensive (compared to grouped, toList, etc.).

Vilius Normantas asked Apr 07 '12 13:04



1 Answer

Run par.map directly; it already takes the number of cores into account. However, do not keep the data in a List, since turning a List into a parallel collection requires a full copy. Use a Vector instead.
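As a rough sketch of this advice, the snippet below keeps the data in a Vector and calls par.map on it directly. The body of someAction and the sample data are hypothetical stand-ins, since the question does not show them. It assumes a Scala version where .par is part of the standard library (2.12 or earlier); on 2.13 and later, .par should additionally need the separate scala-parallel-collections module and the import shown in the comment.

```scala
// On Scala 2.13+, add the scala-parallel-collections dependency and uncomment:
// import scala.collection.parallel.CollectionConverters._

// Hypothetical stand-in for the expensive someAction from the question.
def someAction(s: String): String = s.reverse

// Hypothetical sample data, held in a Vector rather than a List.
val myVector: Vector[String] = Vector.tabulate(1000)(i => s"item$i")

// Run par.map directly; the parallel collection splits the work itself.
// Element order is preserved in the result.
val updated: Vector[String] = myVector.par.map(someAction).toVector
```

Note also that grouped(n) yields chunks of n elements each, not n chunks, so grouped(numberOfCores) on a list of several million items produces a very large number of tiny groups rather than one group per core.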

Daniel C. Sobral answered Oct 12 '22 11:10
