Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Will Parallel.ForEach process in order with MaxDegreeOfParallelism=1?

Is Parallel.ForEach() with MaxDegreeOfParallelism==1 guaranteed to process the input enumerable in-order?

If the answer is "no", is there a way to enforce this behavior?

like image 642
D.R. Avatar asked Mar 15 '17 13:03

D.R.


People also ask

Does ForEach execute in parallel?

ForEach will always execute in parallel.

Does parallel ForEach use ThreadPool?

Parallel. ForEach uses managed thread pool to schedule parallel actions. The number of threads is set by ThreadPool.

Is parallel ForEach blocking?

No, it doesn't block and returns control immediately. The items to run in parallel are done on background threads.

Does parallel ForEach wait?

You don't have to do anything special, Parallel. Foreach() will wait until all its branched tasks are complete. From the calling thread you can treat it as a single synchronous statement and for instance wrap it inside a try/catch.


2 Answers

First, it is correct that Microsoft's official documentation on parallel programming states that the execution order is not guaranteed.

The Parallel.ForEach method does not guarantee the order of execution. Unlike a sequential ForEach loop, the incoming values aren't always processed in order.

It would be best to use Parallel.ForEach as the public API is designed: to process items in a parallel manner. If you need to process items sequentially, you're much better off using a regular foreach loop. The intent is clearer than using MaxDegreeOfParallelism = 1.

With that being said, for curiosity's sake, I took a look at the source code for .NET 4.7.1. The short answer is yes, the items will be processed sequentially if MaxDegreeOfParallelism = 1. However, you shouldn't rely on this for future implementations, because it may not always be this way.

  1. Taking a look at Parallel.ForEach and following it through, you'll eventually see that the collection to be iterated over is partitioned (this process is slightly different whether it is a TSource[], List<TSource>, or an IEnumerable<TSource>.

  2. Task.SavedStateForNextReplica and Task.SavedStateFromPreviousReplica are overridden in ParallelForReplicaTask in order to communicate state between tasks running in parallel. In this case, they are used to communicate which partition the task should iterate over.

  3. Finally, let's take a look at Task.ExecuteSelfReplicating. ParallelForReplicatingTask overrides ShouldReplicate based on the degree of parallelism specified as well as the task scheduler's MaximumConcurrencyLevel. So, this with MaxDegreeOfParallelism = 1 will only create a single child task. As such, this task will only operate over the single partition which was created.

So, to answer your question: as of writing, Parallel.ForEach with MaxDegreeOfParallism = 1 will enumerate the collection from beginning to end for a TSource[], from beginning to end for an IList<TSource>, and use GetEnumerator for an IEnumerable<TSource>, with slightly different paths depending on if the IEnumerable<TSource> can be cast to an OrderablePartitioner<TSource> or not. These three paths are determined in Parallel.ForEachWorker.

I strongly encourage you to browse through the source code on your own to see for yourself.

I hope this is able to answer your question, but it's really important to remember: don't rely on this. It is very possible that this implementation can change in the future.

like image 200
johnnyRose Avatar answered Oct 20 '22 10:10

johnnyRose


From MSDN:

The Parallel.ForEach method does not guarantee the order of execution. Unlike a sequential ForEach loop, the incoming values aren't always processed in order.

https://msdn.microsoft.com/library/ff963552.aspx

like image 39
Sylence Avatar answered Oct 20 '22 09:10

Sylence