Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parallel.ForEach Ordered Execution

I am trying to execute parallel functions on a list of objects using the new C# 4.0 Parallel.ForEach function. This is a very long maintenance process. I would like to make it execute in the order of the list so that I can stop and continue execution at the previous point. How do I do this?

Here is an example. I have a list of objects: a1 to a100. This is the current order:

a1, a51, a2, a52, a3, a53...

I want this order:

a1, a2, a3, a4...

I am OK with some objects being run out of order, but as long as I can find a point in the list where I can say that all objects before this point were run. I read the parallel programming csharp whitepaper and didn't see anything about it. There isn't a setting for this in the ParallelOptions class.

like image 496
Jeff Z Avatar asked Sep 03 '10 21:09

Jeff Z


People also ask

Does ForEach execute in parallel?

ForEach loop runs on multiple threads and the processing takes place in a parallel manner.

Does parallel ForEach wait for completion?

You don't have to do anything special, Parallel. Foreach() will wait until all its branched tasks are complete. From the calling thread you can treat it as a single synchronous statement and for instance wrap it inside a try/catch.

Is parallel ForEach good?

In many cases, Parallel. For and Parallel. ForEach can provide significant performance improvements over ordinary sequential loops. However, the work of parallelizing the loop introduces complexity that can lead to problems that, in sequential code, are not as common or are not encountered at all.

Does parallel ForEach use ThreadPool?

Parallel. ForEach uses managed thread pool to schedule parallel actions. The number of threads is set by ThreadPool.


3 Answers

If you use Parallel.Break to terminate the loop then you are guarenteed that all indices below the returned value will have been executed. This is about as close as you can get. The example here uses For but ForEach has similar overloads.

int n = ...
var result = new double[n];

var loopResult = Parallel.For(0, n, (i, loopState) =>
{
   if (/* break condition is true */)
   {
      loopState.Break();
      return;
   }
   result[i] = DoWork(i);
});

if (!loopResult.IsCompleted && 
        loopResult.LowestBreakIteration.HasValue)
{
   Console.WriteLine("Loop encountered a break at {0}", 
                      loopResult.LowestBreakIteration.Value);
}

In a ForEach loop, an iteration index is generated internally for each element in each partition. Execution takes place out of order but after break you know that all the iterations lower than LowestBreakIteration will have been completed.

Taken from "Parallel Programming with Microsoft .NET" http://parallelpatterns.codeplex.com/

Available on MSDN. See http://msdn.microsoft.com/en-us/library/ff963552.aspx. The section "Breaking out of loops early" covers this scenario.

See also: http://msdn.microsoft.com/en-us/library/dd460721.aspx

like image 102
Ade Miller Avatar answered Oct 02 '22 15:10

Ade Miller


Do something like this:

int current = 0;
object lockCurrent = new object();

Parallel.For(0, list.Count, 
             new ParallelOptions { MaxDegreeOfParallelism = MaxThreads },
             (ii, loopState) => {
                    // So the way Parallel.For works is that it chunks the task list up with each thread getting a chunk to work on...
                    // e.g. [1-1,000], [1,001- 2,000], [2,001-3,000] etc...
                    // We have prioritized our job queue such that more important tasks come first. So we don't want the task list to be
                    // broken up, we want the task list to be run in roughly the same order we started with. So we ignore tha past in 
                    // loop variable and just increment our own counter.
                    int thisCurrent = 0;
                    lock (lockCurrent) {
                        thisCurrent = current;
                        current++;
                    }
                    dothework(list[thisCurrent]);
                 });

You can see how when you break out of the parallel for loop you will know the last list item to be executed, assuming you let all threads finish prior to breaking. I'm not a big fan of PLINQ or LINQ. I honestly don't see how writing LINQ/PLINQ leads to maintainable source code or readability.... Parallel.For is a much better solution.

like image 37
James Krewson Avatar answered Oct 02 '22 15:10

James Krewson


For anyone else who comes across this question - if you're looping over an array or list (rather than an IEnumberable ), you can use the overload of Parallel.Foreach that gives the element index to maintain original order too.

string[] MyArray; // array of stuff to do parallel tasks on 
string[] ProcessedArray = new string[MyArray.Length];
Parallel.ForEach(MyArray, (ArrayItem,loopstate,ArrayElementIndex) =>
{
    string ProcessedArrayItem = TaskToDo(ArrayItem);
    ProcessedArray[ArrayElementIndex] = ProcessedArrayItem;
});
like image 41
Michael Low Avatar answered Oct 02 '22 14:10

Michael Low