I am new to C#, Parallel.ForEach, and .NET in general. I want to parallelize a search that involves thousands of locations. For each location, I compute great circle distance. That is a calculation I want to spread to different cores. My question is how do I do this if I only have one thread-local variable, as in this MSDN TPL example? For the result, I looked at Interlocked, and saw its options Add, CompareExchange, Decrement, Exchange, Increment and Read, but I'm not just adding, incrementing, decrementing, or testing for equality. I want to return the object, over several threads running in parallel, that has the shortest overall distance. My gut says this should be easy, that I should be able to create some little object that wraps a Location and a distance, but how do I capture the best answer from each thread and then choose the shortest distance among them? Here is the non-parallel version:
Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
    double closest = double.MaxValue;
    Location closestLoc = null;
    foreach (Location aLoc in allLocations)
    {
        if (aLoc != myLocation)
        {
            double d = greatCircle(myLocation, aLoc);
            if (d < closest)
            {
                closest = d;
                closestLoc = aLoc;
            }
        }
    }
    return closestLoc;
}
I did see a DDJ Blog Post that seemed to offer good advice, but I wondered if it was the best advice. I see the author looping over arrays, and wonder if there isn't a more functional way of doing this. In the functional world I would use map, lambda and min.
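The easiest option here would be to switch to PLINQ. Note that Min with a selector returns the smallest distance (a double), not the Location that produced it, so pair each location with its distance and aggregate on the pair:
Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
    return allLocations
        .AsParallel()
        .Where(location => location != myLocation)
        .Select(location => Tuple.Create(greatCircle(myLocation, location), location))
        .Aggregate(
            Tuple.Create(double.MaxValue, (Location)null),          // seed: no best yet
            (best, next) => next.Item1 < best.Item1 ? next : best,  // best within a partition
            (a, b) => b.Item1 < a.Item1 ? b : a,                    // merge partition results
            best => best.Item2);                                    // return the Location
}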
That being said, this is basically just aggregation with parallel constructs. You have a couple of options if you want to stick to the Parallel class. One option would be to synchronize this yourself within the block, using locking. I wouldn't recommend this, as it will hurt your overall performance.
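For comparison, here is a minimal sketch of the lock-per-element approach described above. The Location class and the planar greatCircle placeholder are assumptions made only so the sketch compiles on its own; they are not the question's real types or a real great-circle formula.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's Location type.
public sealed class Location
{
    public double Lat { get; }
    public double Lon { get; }
    public Location(double lat, double lon) { Lat = lat; Lon = lon; }
}

public static class NaiveLocking
{
    // Placeholder distance (planar), NOT a real great-circle calculation.
    public static double greatCircle(Location a, Location b)
    {
        double dLat = a.Lat - b.Lat, dLon = a.Lon - b.Lon;
        return Math.Sqrt(dLat * dLat + dLon * dLon);
    }

    public static Location findClosestLocation(Location myLocation, List<Location> allLocations)
    {
        double closest = double.MaxValue;
        Location closestLoc = null;
        object sync = new object();

        Parallel.ForEach(allLocations, location =>
        {
            if (location == myLocation)
                return; // skip the query point itself

            // The distance calculation still runs in parallel...
            double d = greatCircle(myLocation, location);

            // ...but every single element takes the lock to compare
            // against the shared best, which is where contention comes from.
            lock (sync)
            {
                if (d < closest)
                {
                    closest = d;
                    closestLoc = location;
                }
            }
        });

        return closestLoc;
    }
}
```

This gives the right answer, but with thousands of locations the lock is acquired thousands of times, once per element.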
The better option is to use the Parallel.ForEach overloads that provide local state. They would allow you to rewrite this as:
Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
    double closest = double.MaxValue;
    Location closestLoc = null;
    object sync = new object();
    Parallel.ForEach<Location, Tuple<double, Location>>(
        allLocations,
        // localInit: each task starts with "no best location yet"
        () => Tuple.Create(double.MaxValue, (Location)null),
        // body: update this task's running best, no locking needed
        (location, loopState, localState) =>
        {
            if (location == myLocation)
                return localState;
            double d = greatCircle(myLocation, location);
            if (d < localState.Item1)
                return Tuple.Create(d, location);
            else
                return localState;
        },
        // localFinally: merge this task's best into the shared result
        localState =>
        {
            lock (sync)
            {
                if (localState.Item1 < closest)
                {
                    closest = localState.Item1;
                    closestLoc = localState.Item2;
                }
            }
        }
    );
    return closestLoc;
}
I cover using local state for aggregations in detail on my blog. This basically changes the operation to one lock operation per thread instead of one lock per processing element, so you get much higher throughput than a naive locking solution.
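To see the difference in lock traffic, here is a small sketch that applies the same local-state pattern to a plain minimum over integers and counts how often the lock is taken. The LocalStateDemo class and MinWithLocalState method are hypothetical names for illustration, and integers stand in for locations to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class LocalStateDemo
{
    // Returns the smallest value plus the number of times the lock was acquired.
    public static (int Min, int LockCount) MinWithLocalState(IReadOnlyList<int> values)
    {
        int min = int.MaxValue;
        int lockCount = 0;
        object sync = new object();

        Parallel.ForEach<int, int>(
            values,
            // localInit: each task starts from the identity value for "min"
            () => int.MaxValue,
            // body: fold the element into the task-local minimum, lock-free
            (value, loopState, localMin) => Math.Min(value, localMin),
            // localFinally: runs once per task, so this is the only locking
            localMin =>
            {
                lock (sync)
                {
                    lockCount++;
                    if (localMin < min)
                        min = localMin;
                }
            });

        return (min, lockCount);
    }
}
```

Calling MinWithLocalState on a list of 100,000 values should report a lock count tied to the number of tasks the scheduler created rather than to the number of elements. The exact count varies by machine and run.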