Task.Factory.StartNew vs. Parallel.Invoke

In my application I execute anywhere from a couple dozen to a couple hundred actions in parallel (the actions have no return value).

Which approach would be better:

  1. Using Task.Factory.StartNew in a foreach loop iterating over an Action array (Action[])

    Task.Factory.StartNew(() => someAction());

  2. Using the Parallel class, where actions is an Action array (Action[])

    Parallel.Invoke(actions);

Are those two approaches equivalent? Are there any performance implications?

EDIT

I have run some performance tests, and on my machine (2 CPUs, 2 cores each) the results seem to be very similar. I am not sure how they would look on other machines, such as one with a single CPU. I am also not sure what the memory consumption is, since I do not know how to measure it accurately.

Alexandar asked Jan 02 '13 23:01



2 Answers

The most important difference between these two is that Parallel.Invoke will wait for all the actions to complete before continuing with the code, whereas StartNew will move on to the next line of code, allowing the tasks to complete in their own good time.
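
To make the blocking difference concrete, here is a minimal sketch, assuming a plain console program rather than LINQPad. The class and method names (`BlockingDemo`, `MakeActions`, `CompletedAfterInvoke`, `CompletedAfterWaitAll`) are hypothetical, and each action simply sleeps for 50 ms before bumping a shared counter:

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingDemo
{
    // Builds `count` actions that each sleep briefly, then increment counter[0].
    private static Action[] MakeActions(int count, int[] counter)
    {
        return Enumerable.Range(0, count)
            .Select(_ => (Action)(() =>
            {
                Thread.Sleep(50);
                Interlocked.Increment(ref counter[0]);
            }))
            .ToArray();
    }

    // Parallel.Invoke returns only after every action has run.
    public static int CompletedAfterInvoke(int count)
    {
        var counter = new int[1];
        Parallel.Invoke(MakeActions(count, counter));
        return counter[0];
    }

    // StartNew returns immediately; an explicit WaitAll is needed to block.
    public static int CompletedAfterWaitAll(int count, out int completedBeforeWait)
    {
        var counter = new int[1];
        Task[] tasks = MakeActions(count, counter)
            .Select(a => Task.Factory.StartNew(a))
            .ToArray();
        completedBeforeWait = Volatile.Read(ref counter[0]); // snapshot before waiting
        Task.WaitAll(tasks);
        return counter[0];
    }

    public static void Main()
    {
        Console.WriteLine("After Parallel.Invoke: {0} completed", CompletedAfterInvoke(20));

        int before;
        int after = CompletedAfterWaitAll(20, out before);
        Console.WriteLine("StartNew: {0} completed before WaitAll, {1} after", before, after);
    }
}
```

The count reported after Parallel.Invoke always equals the number of actions. The "before WaitAll" snapshot depends on timing and will usually be lower, because the StartNew loop does not wait for anything.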

This semantic difference should be your first (and probably only) consideration. But for informational purposes, here's a benchmark:

/* This is a benchmarking template I use in LINQPad when I want to do a
 * quick performance test. Just give it a couple of actions to test and
 * it will give you a pretty good idea of how long they take compared
 * to one another. It's not perfect: You can expect a 3% error margin
 * under ideal circumstances. But if you're not going to improve
 * performance by more than 3%, you probably don't care anyway.*/
void Main()
{
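    // Setup: 10,000 no-op actions, so the timings reflect scheduling overhead only.
    // Outside LINQPad this code needs the System.Linq, System.Diagnostics
    // and System.Threading.Tasks namespaces.
    var actions2 =
        (from i in Enumerable.Range(1, 10000)
         select (Action)(() => {})).ToArray();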

    var awaitList = new Task[actions2.Length];
    var actions = new[]
    {
        new TimedAction("Task.Factory.StartNew", () =>
        {
            // Enter code to test here
            int j = 0;
            foreach(var action in actions2)
            {
                awaitList[j++] = Task.Factory.StartNew(action);
            }
            Task.WaitAll(awaitList);
        }),
        new TimedAction("Parallel.Invoke", () =>
        {
            // Enter code to test here
            Parallel.Invoke(actions2);
        }),
    };
    const int TimesToRun = 100; // Tweak this as necessary
    TimeActions(TimesToRun, actions);
}


#region timer helper methods
// Define other methods and classes here
public void TimeActions(int iterations, params TimedAction[] actions)
{
    Stopwatch s = new Stopwatch();
    int length = actions.Length;
    var results = new ActionResult[actions.Length];
    // Perform the actions in their initial order.
    for(int i = 0; i < length; i++)
    {
        var action = actions[i];
        var result = results[i] = new ActionResult{Message = action.Message};
        // Do a dry run to get things ramped up/cached
        result.DryRun1 = s.Time(action.Action, 10);
        result.FullRun1 = s.Time(action.Action, iterations);
    }
    // Perform the actions in reverse order.
    for(int i = length - 1; i >= 0; i--)
    {
        var action = actions[i];
        var result = results[i];
        // Do a dry run to get things ramped up/cached
        result.DryRun2 = s.Time(action.Action, 10);
        result.FullRun2 = s.Time(action.Action, iterations);
    }
    results.Dump();
}

public class ActionResult
{
    public string Message {get;set;}
    public double DryRun1 {get;set;}
    public double DryRun2 {get;set;}
    public double FullRun1 {get;set;}
    public double FullRun2 {get;set;}
}

public class TimedAction
{
    public TimedAction(string message, Action action)
    {
        Message = message;
        Action = action;
    }
    public string Message {get;private set;}
    public Action Action {get;private set;}
}

public static class StopwatchExtensions
{
    public static double Time(this Stopwatch sw, Action action, int iterations)
    {
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds;
    }
}
#endregion

Results (total elapsed milliseconds; each dry run is 10 iterations, each full run is 100):

Message               | DryRun1 | DryRun2 | FullRun1 | FullRun2
----------------------------------------------------------------
Task.Factory.StartNew | 43.0592 | 50.847  | 452.2637 | 463.2310
Parallel.Invoke       | 10.5717 |  9.948  | 102.7767 | 101.1158 

As you can see, using Parallel.Invoke can be roughly 4.5x faster than waiting for a bunch of newed-up tasks to complete. Of course, that's when your actions do absolutely nothing. The more each action does, the less of a difference you'll notice.
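
To see how the gap changes as each action does more work, a standalone sketch along these lines could be used. It is not the benchmark above: the names (`WorkloadComparison`, `MakeActions`, `TimeMs`) are hypothetical, `Thread.SpinWait` stands in for real work, and there is no warm-up pass, so treat the numbers as rough:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class WorkloadComparison
{
    // Builds `count` actions that each busy-spin for `spinIterations` iterations.
    public static Action[] MakeActions(int count, int spinIterations)
    {
        return Enumerable.Range(0, count)
            .Select(_ => (Action)(() => Thread.SpinWait(spinIterations)))
            .ToArray();
    }

    // Runs `run` once and returns the elapsed wall-clock time in milliseconds.
    public static double TimeMs(Action run)
    {
        var sw = Stopwatch.StartNew();
        run();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        // Increasing amounts of simulated work per action (assumed values).
        foreach (int spin in new[] { 0, 1000, 100000 })
        {
            Action[] actions = MakeActions(1000, spin);

            double startNew = TimeMs(() =>
                Task.WaitAll(actions.Select(a => Task.Factory.StartNew(a)).ToArray()));
            double invoke = TimeMs(() => Parallel.Invoke(actions));

            Console.WriteLine(
                "spin={0}: StartNew+WaitAll {1:F2} ms, Parallel.Invoke {2:F2} ms",
                spin, startNew, invoke);
        }
    }
}
```

Results will vary by machine and runtime version, so no expected output is shown here.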

StriplingWarrior answered Oct 10 '22 19:10


In the grand scheme of things, the performance difference between the two methods is negligible compared with the overhead of dealing with lots of tasks in the first place.

Parallel.Invoke basically calls Task.Factory.StartNew() for you. So, I'd say readability is more important here.

Also, as StriplingWarrior mentions, Parallel.Invoke performs a WaitAll for you (blocking the calling code until all the tasks are completed), so you don't have to do that either. If you want the tasks to run in the background without caring when they complete, then you want Task.Factory.StartNew().
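
As a rough illustration of that equivalence, a hand-rolled helper might look like the sketch below. `InvokeSketch.InvokeAll` is a hypothetical name, and this is only an approximation: the real Parallel.Invoke is more elaborate, and its documentation only promises to run the actions "possibly in parallel".

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InvokeSketch
{
    // Approximation only: start one task per action, then block until all finish.
    public static void InvokeAll(params Action[] actions)
    {
        var tasks = new Task[actions.Length];
        for (int i = 0; i < actions.Length; i++)
        {
            tasks[i] = Task.Factory.StartNew(actions[i]);
        }
        // Blocks the caller; throws AggregateException if any action threw.
        Task.WaitAll(tasks);
    }

    public static void Main()
    {
        int total = 0;
        InvokeAll(
            () => Interlocked.Add(ref total, 1),
            () => Interlocked.Add(ref total, 2),
            () => Interlocked.Add(ref total, 3));
        Console.WriteLine(total); // prints 6
    }
}
```

Dropping the `Task.WaitAll` call turns this into the fire-and-forget variant, at the cost of no longer knowing when (or whether) the actions finished.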

Colin Mackay answered Oct 10 '22 19:10