Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Choosing minimum among minima using Parallel.ForEach

I am new to C#, Parallel.ForEach, and .NET in general. I want to parallelize a search that involves thousands of locations. For each location, I compute great circle distance. That is a calculation I want to spread to different cores. My question is how do I do this if I only have one thread-local variable, as in this MSDN TPL example? For the result, I looked at Interlocked, and saw its options Add, CompareExchange, Decrement, Exchange, Increment and Read, but I'm not just adding, incrementing, decrementing, or testing for equality. I want to return the object, over several threads running in parallel, that has the shortest overall distance. My gut says this should be easy, that I should be able to create some little object that wraps a Location and a distance, but how do I capture the best answer from each thread and then choose the shortest distance among them? Here is the non-parallel version:

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
  double closest = double.MaxValue;
  Location closestLoc = null;
  foreach (Location aLoc in allLocations)
  {
    if (aLoc != myLocation)
    {
      double d = greatCircle(myLocation, aLoc);
      if (d < closest)
      {
        closest = d;
        closestLoc = aLoc;
      }
    }
  }
  return closestLoc;
}

I did see a DDJ Blog Post that seemed to offer good advice, but I wondered if it was the best advice. I see the author looping over arrays, and wonder if there isn't a more functional way of doing this. In the functional world I would use map, lambda and min.

like image 370
gknauth Avatar asked Jul 23 '10 22:07

gknauth


1 Answers

The easiest option here would be to switch to PLINQ:

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
     return allLocations
               .AsParallel()
               .Min(location => greatCircle(myLocation, location));
}

That being said, this is basically just aggregation with parallel constructs. You have a couple of options if you want to stick to the Parallel class. One option would be to synchronize this yourself within the block, using locking. I wouldn't recommend this, as it will hurt your overall performance.

The better option is to use the Parallel.ForEach methods which provide for local state. They would allow you to rewrite this as:

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
  double closest = double.MaxValue;
  Location closestLoc = null;
  object sync = new object();

  Parallel.ForEach<Location, Tuple<double,Location>(
      allLocations,
      () => new Tuple(double.MaxValue, null),
      (location, loopState, localState) =>
      {
          double d = greatCircle(myLocation, aLoc);
          if (d < localState.Item1)
              return new Tuple(d, aLoc);
          else
              return localState;
      },
      localState =>
      {
          lock(sync)
          {
              if (localState.Item1 < closest)
              {
                  closest = localState.Item1;
                  closestLoc = localState.Item2;
              }
          }
      }
  );
  return closestLoc;
}

I cover using local state for aggregations in detail on my blog. This basically changes the operation to one lock operation per thread instead of one lock per processing element, so you get much higher throughput than a naive locking solution.

like image 133
Reed Copsey Avatar answered Nov 14 '22 21:11

Reed Copsey