Parallel.ForEach vs Task.Factory.StartNew

Tags: c#, c#-4.0, task-parallel-library, parallel-extensions

What is the difference between the below code snippets? Won't both be using threadpool threads?

For instance if I want to call a function for each item in a collection,

Parallel.ForEach<Item>(items, item => DoSomething(item));

vs

foreach(var item in items)
{
  Task.Factory.StartNew(() => DoSomething(item));
}

722

asked Feb 15 '11 20:02

stackoverflowuser

4 Answers

The first is a much better option.

Parallel.ForEach, internally, uses a Partitioner<T> to distribute your collection into work items. It will not do one task per item, but rather batch this to lower the overhead involved.

The second option will schedule a single Task per item in your collection. While the results will be (nearly) the same, this introduces far more overhead than necessary, especially for large collections, and makes the overall run time slower.

FYI - The Partitioner used can be controlled by using the appropriate overloads to Parallel.ForEach, if so desired. For details, see Custom Partitioners on MSDN.
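As a rough illustration of those overloads, here is a minimal sketch that passes a range partitioner to Parallel.ForEach so each worker receives a block of indices rather than a single element. The class name, method name, and the rangeSize parameter are made up for this example; only Partitioner.Create and Parallel.ForEach are framework calls.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original answer.
public static class RangePartitionSketch
{
    // Sums the squares of 0..count-1 in parallel, handing each worker a
    // contiguous index range instead of one index at a time.
    public static long SumOfSquares(int count, int rangeSize)
    {
        long total = 0;

        // Partitioner.Create(from, to, rangeSize) yields Tuple<int, int>
        // ranges [Item1, Item2). rangeSize is an illustrative tuning knob.
        var ranges = Partitioner.Create(0, count, rangeSize);

        Parallel.ForEach(ranges, range =>
        {
            long local = 0;
            for (int i = range.Item1; i < range.Item2; i++)
            {
                local += (long)i * i;
            }

            // One synchronized update per range, not per element.
            Interlocked.Add(ref total, local);
        });

        return total;
    }
}
```

Calling RangePartitionSketch.SumOfSquares(1000, 64) would split the 1000 indices into ranges of at most 64 elements. A good range size depends on how expensive each element is, so treat the value as something to measure rather than a recommendation.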

The main difference, at runtime, is that the second runs asynchronously: the loop returns as soon as the tasks are scheduled, without waiting for them to finish. This can be duplicated using Parallel.ForEach by doing:

Task.Factory.StartNew( () => Parallel.ForEach<Item>(items, item => DoSomething(item)));

By doing this, you still take advantage of the partitioners, but don't block until the operation is complete.
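To make the non-blocking behaviour concrete, the sketch below keeps the Task returned by StartNew so the caller can do other work and wait for the loop later. The class and method names, and the use of a counter as the per-item work, are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original answer.
public static class BackgroundLoopSketch
{
    // Starts the parallel loop on a pool thread and returns immediately.
    public static Task StartProcessing(IEnumerable<int> items, Action<int> doSomething)
    {
        return Task.Factory.StartNew(() => Parallel.ForEach(items, doSomething));
    }

    // Example caller: counts how many items were processed.
    public static int CountProcessed(int itemCount)
    {
        int processed = 0;

        Task loop = StartProcessing(
            Enumerable.Range(0, itemCount),
            _ => Interlocked.Increment(ref processed));

        // ... the caller is free to do other work here ...

        // Wait blocks until the whole loop has finished and rethrows any
        // exception from the loop body wrapped in an AggregateException.
        loop.Wait();
        return processed;
    }
}
```

If the returned task is never waited on or observed, exceptions thrown inside the loop body can go unnoticed, so it is usually worth holding on to it.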

171

answered Sep 29 '22 21:09

Reed Copsey

I did a small experiment: I ran a method 1,000,000,000 (one billion) times, once with Parallel.For and once with Task objects.

I measured the processor time and found Parallel.For more efficient. Parallel.For divides the work into small work items and spreads them across all the cores in an efficient way. Creating a large number of Task objects (FYI, the TPL uses the thread pool internally) instead schedules every call as its own task, which puts more stress on the machine, as the experiment below shows.

I have also made a short video that explains the basics of the TPL and demonstrates how Parallel.For uses your cores more efficiently than plain tasks and threads: http://www.youtube.com/watch?v=No7QqSc5cl8

Experiment 1

Parallel.For(0, 1000000000, x => Method1());

Experiment 2

for (int i = 0; i < 1000000000; i++)
{
    Task o = new Task(Method1);
    o.Start();
}

[Image: processor time comparison for the two experiments]

answered Sep 28 '22 21:09

Shivprasad Koirala

Parallel.ForEach will optimize the work (it may not even start new threads) and blocks until the loop is finished. Task.Factory.StartNew explicitly creates a new task instance for each item and returns before the tasks are finished (they run asynchronously). Parallel.ForEach is much more efficient.

answered Sep 29 '22 21:09

Sogger

In my view the most realistic scenario is when each task has a heavy operation to complete. Shivprasad's approach measures object creation and memory allocation more than the computation itself. I ran a test calling the following method:

public static double SumRootN(int root)
{
    double result = 0;
    for (int i = 1; i < 10000000; i++)
    {
        result += Math.Exp(Math.Log(i) / root);
    }
    return result;
}

A single call to this method takes about 0.5 seconds.

I called it 200 times using Parallel:

Parallel.For(0, 200, (int i) =>
{
    SumRootN(10);
});

Then I called it 200 times using the old-fashioned way:

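// loopCounter is the number of calls (200 in this test).
List<Task> tasks = new List<Task>();
for (int i = 0; i < loopCounter; i++)
{
    Task t = new Task(() => SumRootN(10));
    t.Start();
    tasks.Add(t);
}

// Block until every task has finished so both timings cover the same work.
Task.WaitAll(tasks.ToArray());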

The first case completed in 26656 ms, the second in 24478 ms. I repeated the test many times, and every time the second approach was marginally faster.
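For anyone who wants to repeat the comparison, a timing harness along these lines might look like the sketch below. It is not the code used for the numbers above: the class name and the iterations parameter are assumptions, the tasks are started with Task.Factory.StartNew rather than new Task plus Start, and the results will vary with hardware and runtime version.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical harness, not the code that produced the timings above.
public static class TimingSketch
{
    // Same workload as SumRootN above, with the inner loop bound made a
    // parameter so that short runs are possible.
    public static double SumRootN(int root, int iterations)
    {
        double result = 0;
        for (int i = 1; i < iterations; i++)
        {
            result += Math.Exp(Math.Log(i) / root);
        }
        return result;
    }

    // Times 'calls' invocations scheduled through Parallel.For.
    public static TimeSpan TimeParallelFor(int calls, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        Parallel.For(0, calls, i => { SumRootN(10, iterations); });
        watch.Stop();
        return watch.Elapsed;
    }

    // Times 'calls' invocations, one task per call, waiting for all of them.
    public static TimeSpan TimeTasks(int calls, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        Task[] tasks = new Task[calls];
        for (int i = 0; i < calls; i++)
        {
            tasks[i] = Task.Factory.StartNew(() => SumRootN(10, iterations));
        }
        Task.WaitAll(tasks);
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Running each method a few times before recording anything helps keep JIT compilation and thread pool warm-up out of the measurement, for example TimingSketch.TimeParallelFor(200, 10000000) followed by TimingSketch.TimeTasks(200, 10000000).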

answered Sep 29 '22 21:09

user1089583
