
Using 'AsParallel()' / 'Parallel.ForEach()' guidelines?

Looking for a little advice on leveraging AsParallel() or Parallel.ForEach() to speed this up.

See the method I've got (simplified/bastardized for this example) below.

It takes a list like "US, FR, APAC", where "APAC" is an alias for maybe 50 other countries ("US, FR, JP, IT, GB", etc.). The method should take "US, FR, APAC" and convert it to a list of "US", "FR", plus all the countries that are in "APAC".

private IEnumerable<string> Countries (string[] countriesAndAliases)
{
    var countries = new List<string>();

    foreach (var countryOrAlias in countriesAndAliases)
    {
        if (IsCountryNotAlias(countryOrAlias))
        {
            countries.Add(countryOrAlias);
        }
        else
        {
            foreach (var aliasCountry in AliasCountryLists[countryOrAlias])
            {
                countries.Add(aliasCountry);
            }
        }
    }

    return countries.Distinct();
}

Is making this parallelized as simple as changing it to what's below? Is there more nuance to using AsParallel() than this? Should I be using Parallel.ForEach() instead of foreach? What rules of thumb should I use when parallelizing foreach loops?

private IEnumerable<string> Countries (string[] countriesAndAliases)
{
    var countries = new List<string>();

    foreach (var countryOrAlias in countriesAndAliases.AsParallel())
    {
        if (IsCountryNotAlias(countryOrAlias))
        {
            countries.Add(countryOrAlias);
        }
        else
        {
            foreach (var aliasCountry in AliasCountryLists[countryOrAlias].AsParallel())
            {
                countries.Add(aliasCountry);
            }
        }
    }

    return countries.Distinct();
}
SnickersAreMyFave asked Sep 23 '10 17:09



2 Answers

Several points.

  1. Writing just countriesAndAliases.AsParallel() is useless. AsParallel() makes the part of the LINQ query that comes after it execute in parallel. Here that part is empty, so it achieves nothing.
  2. Generally, you parallelize a foreach loop by replacing it with Parallel.ForEach(). But beware of code that is not thread safe, and you have some: you can't just wrap this loop body in Parallel.ForEach(), because List<T>.Add is not thread safe.
  3. So you should do it like this (sorry, I didn't test it, but it compiles):

return countriesAndAliases
    .AsParallel()
    .SelectMany(s =>
        IsCountryNotAlias(s)
            ? Enumerable.Repeat(s, 1)
            : AliasCountryLists[s])
    .Distinct();
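For anyone who wants to try the query above, here is a minimal self-contained sketch around it. The alias table contents and the body of IsCountryNotAlias are assumptions made up for illustration (the question does not show them), and the class name CountryExpander is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountryExpander
{
    // Hypothetical alias table standing in for the question's AliasCountryLists.
    // It is only read after initialization, never modified.
    private static readonly Dictionary<string, string[]> AliasCountryLists =
        new Dictionary<string, string[]>
        {
            ["APAC"] = new[] { "JP", "AU", "US" },
        };

    // Assumed implementation: anything that is not a known alias is a country.
    private static bool IsCountryNotAlias(string s) => !AliasCountryLists.ContainsKey(s);

    public static IEnumerable<string> Countries(string[] countriesAndAliases)
    {
        return countriesAndAliases
            .AsParallel()
            // Each input expands to itself (a country) or to the alias's members.
            .SelectMany(s => IsCountryNotAlias(s)
                ? Enumerable.Repeat(s, 1)
                : AliasCountryLists[s])
            // PLINQ merges the partial results; no shared List<T> is written to.
            .Distinct();
    }
}
```

Calling CountryExpander.Countries(new[] { "US", "FR", "APAC" }) should yield US, FR, JP and AU once each. Without AsOrdered() the order of the results is not guaranteed, so sort them if order matters. For an input of only a handful of strings, the parallel overhead will likely outweigh any gain.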

Edit:

You must be sure about two more things:

  1. IsCountryNotAlias must be thread safe. It would be even better if it were a pure function.
  2. No one modifies AliasCountryLists in the meantime, because dictionaries are not thread safe when a writer runs alongside readers. Or use ConcurrentDictionary to be sure.
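If the alias table really can change while queries are running, the second point could be handled roughly as sketched below. This assumes the table can be swapped for a ConcurrentDictionary; the names SafeAliasLookup, RegisterAlias and Expand are hypothetical and not from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public static class SafeAliasLookup
{
    // Hypothetical alias table; ConcurrentDictionary tolerates writers
    // running at the same time as readers.
    private static readonly ConcurrentDictionary<string, string[]> AliasCountryLists =
        new ConcurrentDictionary<string, string[]>();

    // Hypothetical helper for adding or replacing an alias at any time.
    public static void RegisterAlias(string alias, params string[] countries) =>
        AliasCountryLists[alias] = countries;

    // A single TryGetValue replaces the separate "is it an alias?" check and
    // the indexer read, so an alias cannot change between the two steps.
    public static IEnumerable<string> Expand(string countryOrAlias) =>
        AliasCountryLists.TryGetValue(countryOrAlias, out var countries)
            ? countries
            : new[] { countryOrAlias };

    public static IEnumerable<string> Countries(string[] countriesAndAliases) =>
        countriesAndAliases
            .AsParallel()
            .SelectMany(s => Expand(s))
            .Distinct();
}
```

Expand has no side effects and touches only the concurrent dictionary, which is what makes it safe to call from several PLINQ worker threads. If the table is built once and never modified afterwards, a plain Dictionary is fine for concurrent reads.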

Useful links that will help you:

Patterns for Parallel Programming: Understanding and Applying Parallel Patterns with the .NET Framework 4

Parallel Programming in .NET 4 Coding Guidelines

When Should I Use Parallel.ForEach? When Should I Use PLINQ?

PS: As you can see, the new parallel features are not as obvious as they look (and feel).

Andrey answered Oct 07 '22 20:10


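When you parallelize a loop, you need to make sure that its body is thread safe. Unfortunately, the above code does not meet that requirement. List<T> is not thread safe, so as soon as the loop body actually runs on several threads (for example, inside Parallel.ForEach()), the calls to countries.Add() will race. Note that a plain foreach over an AsParallel() query still runs its body on the calling thread, so the code as posted gains no parallelism in the first place.

If, however, you switch to a collection in System.Collections.Concurrent, such as ConcurrentBag<T>, adding from multiple threads becomes safe and a parallel version of the loop will most likely work.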
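A rough sketch of that idea, using Parallel.ForEach() together with ConcurrentBag<T>. The alias table contents and the class name ConcurrentBagExample are assumptions for illustration only, and a TryGetValue lookup stands in for the question's IsCountryNotAlias check.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrentBagExample
{
    // Hypothetical alias table; read-only after initialization, so
    // concurrent reads from a plain Dictionary are fine here.
    private static readonly Dictionary<string, string[]> AliasCountryLists =
        new Dictionary<string, string[]>
        {
            ["APAC"] = new[] { "JP", "AU", "US" },
        };

    public static IEnumerable<string> Countries(string[] countriesAndAliases)
    {
        // ConcurrentBag<T>.Add may be called from many threads at once.
        var countries = new ConcurrentBag<string>();

        // Blocks until every item has been processed.
        Parallel.ForEach(countriesAndAliases, countryOrAlias =>
        {
            if (AliasCountryLists.TryGetValue(countryOrAlias, out var aliasCountries))
            {
                foreach (var aliasCountry in aliasCountries)
                {
                    countries.Add(aliasCountry);
                }
            }
            else
            {
                countries.Add(countryOrAlias);
            }
        });

        // Back on the calling thread: remove duplicates sequentially.
        return countries.Distinct();
    }
}
```

A ConcurrentBag<T> does not preserve insertion order, so treat the result as an unordered set of country codes.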

Reed Copsey answered Oct 07 '22 20:10