I currently have a for loop to compare 2 arrays and determine if they're equal.
public override bool Equals(object obj)
{
RushHourPathLengthNode otherNode = (RushHourPathLengthNode)obj;
// Compare their carCoords and return false as soon as we find a difference
for (int i = 0, l = carCoords.Length; i < l; ++i)
if (carCoords[i].x != otherNode.carCoords[i].x || carCoords[i].y != otherNode.carCoords[i].y)
return false;
return true;
}
This works well but it's one of the slowest parts of my progam. This way my testcase takes about 7 seconds to compute.
Although i may have 50K tasks running, the CPU usage was about 50% on my i7 860 CPU (4 cores, 8 threads).
My idea was to use a parallel for loop to maximize CPU usage and make this faster. This is what i came up with.
public override bool Equals(object obj)
{
RushHourPathLengthNode otherNode = (RushHourPathLengthNode)obj;
bool result = true;
Parallel.For(0, carCoords.Length, (i, loopState) =>{
if (!result)
loopState.Stop();
if (carCoords[i].x != otherNode.carCoords[i].x || carCoords[i].y != otherNode.carCoords[i].y)
result = false;
});
return result;
}
To me this looks like it will try to run parallel and also stop working on it as soon as a difference has been found because of loopState.Stop. This way the CPU usage is 90%+ but my testcase takes about 35 seconds to compute and I do not understand why.
Is there something wrong with my implementation or is my entire approach wrong?
EDIT : carCoords.Length will be a value between 2 and +-100. It sounds like that value is too low to justify doing this in parallel.
First, if your for loop only does a few 100 iterations, don't bother parallelizing it. With the work you're doing in each iteration, you need thousands of iterations to make it worthwhile.
Assuming you have many iterations, the main reason for the huge slow down is that you are only doing a tiny bit of work in each iteration. There is a lot of overhead in scheduling the tasks, calling the method body, etc. and that dominates the overall execution time.
You can get around this by partitioning the range into segments using Partinioner.Create
(from the System.Collections.Concurrent
namespace). This gives you a sequence of tuples containing start and end indexes for each range. You can then let each task iterate over a range, which is a lot more efficient.
Secondly, since local variables are captured in a closure object, it is sometimes faster to use a local copy of the variable inside the method body.
So this is what we get:
Parallel.ForEach(Partitioner.Create(0, carCoords.Length), (r, loopState) => {
var c1 = carCoords;
var c2 = otherNode.carCoords;
int end = r.Item2;
for (int i = r.Item1; i < end; ++i) {
if (loopState.IsStopped)
return;
if (c1[i].x != c2[i].x || c1[i].y != c2[i].y) {
loopState.Stop();
return;
}
}
});
I'm not sure it's worth it to check IsStopped
on every iteration. It may be more efficient to have more partitions (play with the Partitioner.Create
overloads) and put the check in before the for loop.
Implement IEqutable<RushHourPathLengthNode>
interface on your class to avoid casting. It will win you seconds.
If carCoords.Length is small, then parallel loop will take more time, since managing threads and switching is costly.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With