 

Is Scala doing anything in parallel on its own?

I have a little program that creates mazes. It uses lots of collections (the default variants, which are immutable, or at least used as immutable).

The program calculates 30 mazes with increasing dimensions, using a for comprehension over (1 to 30).

Since the parallel collections framework became available in recent versions, I thought I'd give it a spin, hoping for some performance gain.

This failed and when I investigated a little, I found the following:

  1. When run without any call to anything remotely parallel, it still showed a processor load of about 30% on each of the 4 cores of my machine.

  2. When I replaced the Range 1 to 30 with (1 to 30).par, CPU load went up to about 80% on all cores (which I expected). The order in which the mazes completed became more or less random (which I expected). The total time for all mazes stayed the same.

  3. Replacing some of the internally used collections with their parallel counterparts didn't seem to have any effect.

I now have 2 questions:

  • Why do I have all 4 cores spinning, although there isn't anything that runs in parallel?

  • What might be likely reasons for the program to still take the same time, whether running in parallel or not? There are no obvious bottlenecks other than CPU cycles (no IO, no network, plenty of memory via the -Xmx setting).

Any ideas on this?
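For reference, the shape of the program can be sketched like this. Everything here is a hypothetical stand-in (solveMaze just burns CPU deterministically; the actual maze code isn't shown), and because on Scala 2.13+ the .par conversion lives in the separate scala-parallel-collections module, the parallel variant is expressed with plain Futures from the standard library, which do roughly what (1 to 30).par does here:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object MazeTiming {
  // Placeholder for the real maze computation: just burns CPU deterministically.
  def solveMaze(size: Int): Long =
    (1L to size * 100000L).foldLeft(0L)((acc, i) => acc + i % 7)

  // Times a computation, returning the result and elapsed milliseconds.
  def time[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000)
  }

  def main(args: Array[String]): Unit = {
    // Sequential: the original for comprehension over 1 to 30.
    val (seqResults, seqMs) = time(for (n <- 1 to 30) yield solveMaze(n))

    // Parallel: one Future per maze, roughly what (1 to 30).par does.
    val (parResults, parMs) = time {
      val futures = (1 to 30).map(n => Future(solveMaze(n)))
      futures.map(f => Await.result(f, Duration.Inf))
    }

    println(s"sequential: ${seqMs}ms, parallel: ${parMs}ms")
    assert(seqResults.toList == parResults.toList) // same answers either way
  }
}
```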

asked Jun 07 '11 by Jens Schauder



1 Answer

The 30% per core version is just a poor scheduler (sounds like Windows 7) migrating the process from core to core very frequently. It's probably closer to 25% per core (1/4) for your process, plus miscellaneous other load making up the 30%. If you ran the same example under Linux, you would probably see one core pegged.

When you converted to (1 to 30).par, you started really using threads across all cores but the synchronization overhead of distributing such a small amount of work and then collecting the results cancelled out the parallelism gains. You need to break your work into larger independent chunks.
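One way to do that can be sketched as follows (my own placeholder work function, using plain Futures so it runs on any recent Scala without the separate parallel-collections module): group the 30 tasks into one batch per core, so each thread picks up a substantial chunk instead of paying per-item distribution overhead.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object Chunked {
  // Placeholder for one unit of work (the real maze solver isn't shown).
  def work(n: Int): Long = (1L to 100000L).map(i => i % (n + 1)).sum

  // Split the tasks into `chunks` batches, run one Future per batch,
  // and flatten the results back into the original order.
  def runChunked(tasks: Range, chunks: Int): Seq[Long] = {
    val batchSize = math.ceil(tasks.size.toDouble / chunks).toInt
    val batches   = tasks.grouped(batchSize).toSeq
    val futures   = batches.map(batch => Future(batch.map(work)))
    futures.flatMap(f => Await.result(f, Duration.Inf))
  }
}
```

With 30 tasks and 4 cores this submits four batches of roughly 8 tasks each, so the per-item synchronization cost is paid 4 times instead of 30.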

EDIT: If each of 1..30 represents some larger amount of work (solving a maze, say), then automatic parallelization will work much better if each unit of work is about the same size. Imagine you had 29 easy mazes and one very, very hard maze: the 30th maze will still run serially (or very nearly so) after everything else finishes. If your mazes increase in complexity by number, try spawning them in the order 30 to 1 by -1 so that the biggest tasks go first. Think of it as a braindead solution to the knapsack problem.
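The biggest-first suggestion is essentially longest-processing-time-first scheduling. A toy simulation (with made-up costs, assuming maze n costs n units of work) shows why the submission order matters on a 4-core machine:

```scala
object BigFirst {
  // Greedy scheduler simulation: each task goes to the currently
  // least-loaded of `workers` cores; makespan = the busiest core's load.
  def makespan(costs: Seq[Int], workers: Int): Int = {
    val load = Array.fill(workers)(0)
    for (c <- costs) {
      val idx = load.indexOf(load.min)
      load(idx) += c
    }
    load.max
  }

  // Hypothetical costs: maze n takes n units, as if complexity grows with size.
  val costs: Seq[Int] = (1 to 30).toSeq

  def main(args: Array[String]): Unit = {
    val ascending  = makespan(costs, 4)          // 1 to 30: big tasks land last
    val descending = makespan(costs.reverse, 4)  // 30 to 1 by -1: big tasks first
    println(s"ascending: $ascending, descending: $descending")
    assert(descending <= ascending) // biggest-first never finishes later here
  }
}
```

With these costs the descending order packs the four cores almost perfectly (the total work is 465, so the ideal makespan is about 117), while the ascending order leaves one core stuck with a large final task.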

answered Oct 25 '22 by Ben Jackson
