Parallel.ForEach as fast / slow as normal foreach

UPDATE: I used threading to split the loop across the number of cores (8 in my case), and the complete loop finished in under 1 second. So the problem is not that the operation cannot be made faster with threading. Why did the Parallel Extensions fail in this case?

Hey everyone. I want to replace my foreach with Parallel.ForEach. The problem is that the parallelization brings hardly any advantage for me.

Original:

foreach (Entities.Buchung buchung in buchungen) {
    Int32 categoryID = manager.GetCategoryID(new Regelengine.Booking(buchung)); // Average 4ms
    buchung.Category = categoryID.ToString();
}

Parallel:

System.Threading.Tasks.Parallel.ForEach(buchungen, buchung => {
    Int32 categoryID = manager.GetCategoryID(new Regelengine.Booking(buchung));
    buchung.Category = categoryID.ToString();
});

Results:

---------------------------
Stopwatched Results for 1550 entries in the List:
---------------------------
Parallel.ForEach: 00:00:07.6599066
Average Foreach: 00:00:07.9791303

Maybe the problem is that the actual work in the loop is so short? But nobody can tell me that parallelizing 1550 operations on an Intel i7 won't save any time.

Steav asked Feb 04 '11 15:02



2 Answers

There is only one resource you can take advantage of by using Parallel.For: CPU cycles. With N cores you can theoretically speed up your code by a factor of N. That requires, however, that CPU cycles are actually the constraint in your code, which is not often the case unless you execute computationally expensive code. Other constraints are the speed of the hard disk, the network connection, a database server and, in select cases, the bandwidth of the memory bus. You've only got one of each of those; Parallel.For cannot magically give you another disk.

Testing whether Parallel.For will speed up your code is pretty simple. Just run the code without parallelizing it and observe the CPU load in Taskmgr.exe or Perfmon. If one core isn't running at 100%, your code is not compute bound. If it is running at, say, 10%, then you can only ever hope to make it take 90% of the time, no matter how many cores you have. You'll get that by overlapping I/O wait time with processing time; two threads will get that done.
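The same check can be approximated in code by comparing the CPU time the process consumed with the wall-clock time of the sequential run. This is only a sketch: `CpuBoundCheck` and `MeasureCpuFraction` are hypothetical names, not part of the question's code, and the result is approximate because process CPU time also counts runtime threads such as the garbage collector.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, for illustration only.
public static class CpuBoundCheck
{
    // Runs `work` once on the calling thread and returns
    // (process CPU time consumed) / (wall-clock time elapsed).
    // Close to 1.0: roughly one core was kept busy (compute bound).
    // Close to 0.0: the code mostly waited (I/O, locks, sleeps).
    public static double MeasureCpuFraction(Action work)
    {
        using (Process process = Process.GetCurrentProcess())
        {
            TimeSpan cpuBefore = process.TotalProcessorTime;
            Stopwatch wall = Stopwatch.StartNew();

            work();

            wall.Stop();
            process.Refresh(); // drop cached process data before reading again
            TimeSpan cpuAfter = process.TotalProcessorTime;

            return (cpuAfter - cpuBefore).TotalMilliseconds
                   / wall.Elapsed.TotalMilliseconds;
        }
    }
}
```

For the loop in the question, the original foreach would go inside the delegate passed to `MeasureCpuFraction`. A value near 1.0 would suggest one core is saturated and more cores can help; a value near 0.1 would suggest the loop spends most of its time waiting on something else.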

Hans Passant answered Sep 27 '22 22:09


Questions you should consider here are:

  • What is the overhead of spinning up a thread?
  • What is the overhead of my thread safety (locks)?
  • Where are the actual bottlenecks and will multithreading really help?

The last is your biggest consideration here. For example, if you are maxing out your I/O channel, all the threads in the world won't do squat. So is your task CPU bound or I/O bound?
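On the first question, per-item scheduling overhead can be reduced by handing each worker a contiguous block of indices, which is roughly what the manual 8-thread split in the question's UPDATE does. Below is a minimal sketch using `Partitioner.Create` from `System.Collections.Concurrent`. `ChunkedLoop` and `ForEachChunked` are hypothetical names, and whether this closes the gap for the loop in the question is untested. It assumes the items are already in an indexable list and that the body is safe to call from several threads at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class ChunkedLoop
{
    // Applies `body` to every element of `items`, giving each worker a
    // contiguous index range instead of one element at a time.
    // Assumes concurrent reads of `items` are safe (true for arrays and List<T>
    // as long as nobody modifies the list during the loop).
    public static void ForEachChunked<T>(IList<T> items, Action<T> body)
    {
        if (items.Count == 0) return; // Partitioner.Create rejects an empty range

        var options = new ParallelOptions
        {
            // Cap the number of concurrent workers at the number of cores.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        // Each `range` is a Tuple<int, int>: (fromInclusive, toExclusive).
        Parallel.ForEach(Partitioner.Create(0, items.Count), options, range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                body(items[i]);
            }
        });
    }
}
```

With the question's names, the call would be something like `ChunkedLoop.ForEachChunked(list, buchung => { ... })`, where `list` is `buchungen` copied into a list and `manager.GetCategoryID` is assumed to be thread-safe. If that method takes a lock or waits on a database internally, the second and third questions above still decide the outcome.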

plinth answered Sep 27 '22 20:09