Edit: My sample size was too small. When I ran it against the real data on 8 CPUs, I saw a 7.2x speed increase. Not too shabby for adding 4 characters to my code ;)
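For reference, the 4 characters are the .par call that already appears in the code below: switching from the plain collection to its parallel view is the whole change, along the lines of

val inRange = targets.filter(within50)        // sequential: runs on one core
val inRangePar = targets.par.filter(within50) // parallel: ".par" fans the same filter out across cores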
I am currently trying to 'sell' management on the benefits of using Scala, especially when it comes to scaling with CPUs. To that end, I created a simple test application that does a bunch of vector math, and I was a bit surprised to find that the runtime was not noticeably better on my quad-core machine. Interestingly, I found that the runtime is worst the first time you go through the collection and gets better with subsequent calls. Are there some lazy things in the parallel collection that are causing this, or am I just doing this wrong? It should be noted that I come from the C++/C# world, so it's entirely possible that I have messed up my configuration somehow. Regardless, here's my setup:
IntelliJ Scala Plugin
Scala 2.9.1.final
Windows 7 64-bit, quad-core processor (no hyperthreading)
import util.Random

// simple Vector3D class with final x, y, z components, a length, and a '-' function
class Vector3D(val x: Double, val y: Double, val z: Double) {
  def length = math.sqrt(x * x + y * y + z * z)
  def -(rhs: Vector3D) = new Vector3D(x - rhs.x, y - rhs.y, z - rhs.z)
}

object MainClass {
  def main(args: Array[String]) {
    println("Available CPU's: " + Runtime.getRuntime.availableProcessors())
    println("Parallelism Degree set to: " + collection.parallel.ForkJoinTasks.defaultForkJoinPool.getParallelism)

    // my position
    val myPos = new Vector3D(0, 0, 0)
    val r = new Random(0)

    // nextRand gets us a random Double between 0 and 100
    def nextRand = r.nextDouble() * 100

    // make 10 million random targets
    val targets = (0 until 10000000).map(_ => new Vector3D(nextRand, nextRand, nextRand)).toArray

    // take the .par hit before we start profiling
    val parTargets = targets.par
    println("Created " + targets.length + " vectors")

    // range function: distance from my position to the target
    val rangeFunc: Vector3D => Double = targetPos => (targetPos - myPos).length

    // we'll select the ones that are < 50 away
    val within50: Vector3D => Boolean = targetPos => rangeFunc(targetPos) < 50

    // time it sequentially (note: filter returns the matching targets, not a count)
    val startTime_sequential = System.currentTimeMillis()
    val targetsInRange_sequential = targets.filter(within50)
    val endTime_sequential = System.currentTimeMillis()
    println("Sequential (ms): " + (endTime_sequential - startTime_sequential))

    // do the parallel version 10 times
    for (i <- 1 to 10) {
      val startTime_par = System.currentTimeMillis()
      val targetsInRange_parallel = parTargets.filter(within50)
      val endTime_par = System.currentTimeMillis()
      val ms = endTime_par - startTime_par
      println("Iteration[" + i + "] Executed in " + ms + " ms")
    }
  }
}
The output of this program is:
Available CPU's: 4
Parallelism Degree set to: 4
Created 10000000 vectors
Sequential (ms): 216
Iteration[1] Executed in 227 ms
Iteration[2] Executed in 253 ms
Iteration[3] Executed in 76 ms
Iteration[4] Executed in 78 ms
Iteration[5] Executed in 77 ms
Iteration[6] Executed in 80 ms
Iteration[7] Executed in 78 ms
Iteration[8] Executed in 78 ms
Iteration[9] Executed in 79 ms
Iteration[10] Executed in 82 ms
So what's going on here? The first two times we run the filter it's slower, and then things speed up. I understand that there will inherently be some parallelism startup cost; I'm just trying to figure out where it makes sense to express the parallelism in my application. Specifically, I want to be able to show management a program that runs 3-4 times faster on a quad-core box. Is this just not a good problem?
Ideas?
You have the micro-benchmark disease. You are most likely benchmarking the JIT compile phase. You'll need to warm up your JIT with a pre-run first.
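If you want to stay with plain System.currentTimeMillis timing, a minimal sketch of that warm-up is below; timeAfterWarmup is a made-up helper (not a library function) that runs the body a few times untimed so HotSpot compiles the hot paths, then measures:

// hypothetical helper: run `body` untimed to let the JIT compile it,
// then report the wall-clock time of each subsequent run
def timeAfterWarmup[A](warmupRuns: Int, timedRuns: Int)(body: => A): Seq[Long] = {
  for (_ <- 1 to warmupRuns) body   // results discarded: this is the JIT warm-up
  for (_ <- 1 to timedRuns) yield {
    val start = System.currentTimeMillis()
    body
    System.currentTimeMillis() - start
  }
}

// e.g. against the question's collections:
// val times = timeAfterWarmup(5, 10) { parTargets.filter(within50) }
// println("median ms: " + times.sorted.apply(times.size / 2))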
Probably the best idea is to use a micro-benchmarking framework like http://code.google.com/p/caliper/, which handles all of that for you.
Edit: There is a nice SBT template for Caliper-benchmarking Scala projects, as referenced from this blog post.
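As a rough, untested sketch of how the question's measurement might look under Caliper's 0.x API (a SimpleBenchmark subclass whose time*(reps) methods Caliper invokes after doing its own warm-up; the SBT template above wires up the runner):

import com.google.caliper.SimpleBenchmark
import util.Random

class FilterBenchmark extends SimpleBenchmark {
  // same setup as the question, built once per benchmark instance
  val r = new Random(0)
  def nextRand = r.nextDouble() * 100
  val myPos = new Vector3D(0, 0, 0)
  val targets = (0 until 10000000).map(_ => new Vector3D(nextRand, nextRand, nextRand)).toArray
  val parTargets = targets.par
  val within50 = (t: Vector3D) => (t - myPos).length < 50

  // Caliper calls each time* method with a rep count it chooses
  def timeSequentialFilter(reps: Int) {
    for (_ <- 1 to reps) targets.filter(within50)
  }

  def timeParallelFilter(reps: Int) {
    for (_ <- 1 to reps) parTargets.filter(within50)
  }
}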