Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parallel.ForEach Slower than ForEach

Here is the code:

using (var context = new AventureWorksDataContext()) {     IEnumerable<Customer> _customerQuery = from c in context.Customers                                            where c.FirstName.StartsWith("A")                                            select c;      var watch = new Stopwatch();     watch.Start();      var result = Parallel.ForEach(_customerQuery, c => Console.WriteLine(c.FirstName));      watch.Stop();     Debug.WriteLine(watch.ElapsedMilliseconds);      watch = new Stopwatch();     watch.Start();      foreach (var customer in _customerQuery)     {         Console.WriteLine(customer.FirstName);     }      watch.Stop();     Debug.WriteLine(watch.ElapsedMilliseconds); } 

The problem is, Parallel.ForEach takes about 400ms vs a regular foreach, which takes about 40ms. What exactly am I doing wrong and why doesn't this work as I expect it to?

like image 843
Mike Diaz Avatar asked May 17 '11 19:05

Mike Diaz


People also ask

Is parallel ForEach faster than ForEach?

The execution of Parallel. Foreach is faster than normal ForEach.

Why is parallel ForEach slower?

Since the work in your parallel function is very small, the overhead of the management the parallelism has to do becomes significant, thus slowing down the overall work.

Should you use parallel ForEach?

The short answer is no, you should not just use Parallel. ForEach or related constructs on each loop that you can. Parallel has some overhead, which is not justified in loops with few, fast iterations. Also, break is significantly more complex inside these loops.

Does parallel ForEach wait for completion?

You don't have to do anything special, Parallel. Foreach() will wait until all its branched tasks are complete. From the calling thread you can treat it as a single synchronous statement and for instance wrap it inside a try/catch.


1 Answers

Suppose you have a task to perform. Let's say you're a math teacher and you have twenty papers to grade. It takes you two minutes to grade a paper, so it's going to take you about forty minutes.

Now let's suppose that you decide to hire some assistants to help you grade papers. It takes you an hour to locate four assistants. You each take four papers and you are all done in eight minutes. You've traded 40 minutes of work for 68 total minutes of work including the extra hour to find the assistants, so this isn't a savings. The overhead of finding the assistants is larger than the cost of doing the work yourself.

Now suppose you have twenty thousand papers to grade, so it is going to take you about 40000 minutes. Now if you spend an hour finding assistants, that's a win. You each take 4000 papers and are done in a total of 8060 minutes instead of 40000 minutes, a savings of almost a factor of 5. The overhead of finding the assistants is basically irrelevant.

Parallelization is not free. The cost of splitting up work amongst different threads needs to be tiny compared to the amount of work done per thread.

Further reading:

Amdahl's law

Gives the theoretical speedup in latency of the execution of a task at fixed workload, that can be expected of a system whose resources are improved.

Gustafson's law

Gives the theoretical speedup in latency of the execution of a task at fixed execution time, that can be expected of a system whose resources are improved.

like image 183
Eric Lippert Avatar answered Oct 12 '22 22:10

Eric Lippert