
Parallel.ForEach questions

I am using a Parallel.ForEach loop in C# / VS2010 to do processing and I have a couple of questions.

First of all I have a process that needs to extract information from a remote webservice and then needs to build images (GDI) on the fly.

I have a class that encapsulates all of the functionality in a single object with two main methods, DoLoad() and CreateImage(), with all the GDI management / WebRequests "blackboxed" inside this object.

I then create a GenericList that contains all the objects that need to be processed and I iterate through the list using the following code:

try
{
    Parallel.ForEach(MyLGenericList, ParallelOptions, (MyObject, loopState) =>
    {
        MyObject.DoLoad();
        MyObject.CreateImage();
        MyObject.Dispose();

        if (loopState.ShouldExitCurrentIteration || loopState.IsExceptional)
            loopState.Stop();
    });
}
catch (OperationCanceledException ex)
{
    // Cancel here
}
catch (Exception ex)
{
    throw ex;
}

Now my questions are:

  1. Given that there could be ten thousand items in the list to parse, is the above code the best way to approach this? Any other ideas are more than welcome.
  2. When I start the process, the objects are created / loaded and the images are created very fast, but after around six hundred objects the process starts to crawl. It does eventually finish. Is this normal?

Thanks in advance :) Adam

user758136 asked May 17 '11 21:05





1 Answer

I am not sure that downloading data in parallel is a good idea since it will block a lot of threads. Split your task into a producer and a consumer instead. Then you can parallelize each of them separately.

Here is an example of a single producer and multiple consumers.
(If the consumers are faster than the producer, you can just use a normal foreach instead of Parallel.ForEach.)

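// Queue that hands fetched data from the producer to the consumers.
var sources = new BlockingCollection<SourceData>();

// Single producer: download the items one by one on a background task.
var producer = Task.Factory.StartNew(
    () => {
        foreach (var item in MyLGenericList) {
            var data = webservice.FetchData(item);
            sources.Add(data);
        }
        // Tell the consumers that no more items will arrive.
        sources.CompleteAdding();
    }
);

// Multiple consumers: build images in parallel as data becomes available.
Parallel.ForEach(sources.GetConsumingPartitioner(),
                 data => {
                     imageCreator.CreateImage(data);
                 });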

(the GetConsumingPartitioner extension is part of the ParallelExtensionsExtras)
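
The parenthetical note about using a normal foreach when the consumers outpace the producer can be sketched roughly as follows. This is a minimal sketch, not the original code: `FetchData` and the upper-casing step are hypothetical stand-ins for the web service call and `CreateImage`. It relies only on `BlockingCollection<T>.GetConsumingEnumerable()` from .NET 4, so ParallelExtensionsExtras is not needed for this variant.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SingleConsumerSketch
{
    // Hypothetical stand-in for the web service call; returns fake data.
    static string FetchData(int item) { return "data-" + item; }

    public static List<string> Run(IEnumerable<int> items)
    {
        var processed = new List<string>();
        using (var sources = new BlockingCollection<string>())
        {
            // Producer: fetch each item and hand it to the queue.
            var producer = Task.Factory.StartNew(() =>
            {
                try
                {
                    foreach (var item in items)
                        sources.Add(FetchData(item));
                }
                finally
                {
                    // Always signal completion, otherwise the consumer waits forever.
                    sources.CompleteAdding();
                }
            });

            // Single consumer: blocks while the queue is empty and
            // ends once CompleteAdding has been called and the queue is drained.
            foreach (var data in sources.GetConsumingEnumerable())
                processed.Add(data.ToUpperInvariant()); // stand-in for CreateImage

            // Surfaces any producer exception as an AggregateException.
            producer.Wait();
        }
        return processed;
    }

    public static void Main()
    {
        var result = Run(new[] { 1, 2, 3 });
        Console.WriteLine(string.Join(", ", result));
    }
}
```

The try/finally around the producer loop is a defensive choice: if a fetch throws and CompleteAdding is never called, the consuming foreach would block indefinitely.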

Edit: A more complete example

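var sources = new BlockingCollection<SourceData>();

// Limit concurrent downloads; -1 places no limit on the image processing.
var producerOptions = new ParallelOptions { MaxDegreeOfParallelism = 5 };
var consumerOptions = new ParallelOptions { MaxDegreeOfParallelism = -1 };

// Producers: load the objects in parallel on a background task.
var producers = Task.Factory.StartNew(
    () => {
        Parallel.ForEach(MyLGenericList, producerOptions,
            myObject => {
                myObject.DoLoad();
                sources.Add(myObject);
            });
        // Tell the consumers that no more items will arrive.
        sources.CompleteAdding();
    });

// Consumers: create the images as loaded objects become available.
Parallel.ForEach(sources.GetConsumingPartitioner(), consumerOptions,
    myObject => {
        myObject.CreateImage();
        myObject.Dispose();
    });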

With this code you can tune the number of parallel downloads while keeping the CPU busy with the image processing.
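
For anyone who wants to try the pattern without the ParallelExtensionsExtras dependency, here is a self-contained sketch under some assumptions. `WorkItem` is a hypothetical stand-in for the asker's object (the sleep simulates the web request), and the consumers are plain long-running tasks that each drain `GetConsumingEnumerable()` instead of a `Parallel.ForEach` over `GetConsumingPartitioner()`. The download limit and consumer count are illustrative values to tune, not recommendations.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class PipelineSketch
{
    // Hypothetical stand-in for the object from the question.
    public sealed class WorkItem : IDisposable
    {
        public bool Loaded, ImageCreated;
        public void DoLoad() { Thread.Sleep(5); Loaded = true; } // simulates the web request
        public void CreateImage() { ImageCreated = Loaded; }     // simulates the GDI work
        public void Dispose() { }
    }

    public static int Run(WorkItem[] items, int maxDownloads, int consumerCount)
    {
        int done = 0;
        using (var sources = new BlockingCollection<WorkItem>())
        {
            var producerOptions = new ParallelOptions { MaxDegreeOfParallelism = maxDownloads };

            // Producers: at most maxDownloads loads run at the same time.
            var producers = Task.Factory.StartNew(() =>
            {
                try
                {
                    Parallel.ForEach(items, producerOptions, item =>
                    {
                        item.DoLoad();
                        sources.Add(item);
                    });
                }
                finally
                {
                    // Signal completion even if a load failed.
                    sources.CompleteAdding();
                }
            });

            // Consumers: a fixed number of tasks that each pull from the shared queue.
            var consumers = Enumerable.Range(0, consumerCount)
                .Select(_ => Task.Factory.StartNew(() =>
                {
                    foreach (var item in sources.GetConsumingEnumerable())
                    {
                        item.CreateImage();
                        item.Dispose();
                        Interlocked.Increment(ref done);
                    }
                }, TaskCreationOptions.LongRunning))
                .ToArray();

            Task.WaitAll(consumers);
            producers.Wait(); // surfaces any producer exception
        }
        return done;
    }

    public static void Main()
    {
        var items = Enumerable.Range(0, 50).Select(_ => new WorkItem()).ToArray();
        int processed = Run(items, 5, Environment.ProcessorCount);
        Console.WriteLine("Processed " + processed + " items");
    }
}
```

Keeping the two limits as separate parameters mirrors the idea above: the download side is bound by the remote service, the image side by the CPU, so each can be adjusted independently.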

adrianm answered Oct 12 '22 14:10
