In my application I execute anywhere from a couple dozen to a couple hundred actions in parallel (the actions have no return value).
Which approach would be optimal?
Using Task.Factory.StartNew in a foreach loop iterating over an Action array (Action[]):
Task.Factory.StartNew(() => someAction());
Using the Parallel class, where actions is an Action array (Action[]):
Parallel.Invoke(actions);
Are those two approaches equivalent? Are there any performance implications?
EDIT
I ran some performance tests, and on my machine (2 CPUs with 2 cores each) the results seem to be very similar. I am not sure how this will look on other machines, such as one with a single CPU. I also do not know how to measure memory consumption accurately.
Task.Run(action) internally uses the default TaskScheduler, which means it always offloads the task to the thread pool. Task.Factory.StartNew(action), on the other hand, uses the current scheduler (TaskScheduler.Current), which may not use the thread pool at all!
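To make the scheduler point concrete, here is a minimal sketch of how StartNew can be pinned to the thread pool by passing TaskScheduler.Default explicitly. The class and method names (SchedulerDemo, StartOnThreadPool, RanOnThreadPool) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class SchedulerDemo
{
    // Queues the action to the thread pool scheduler regardless of the
    // ambient TaskScheduler.Current, which is roughly what Task.Run does.
    public static Task StartOnThreadPool(Action action)
    {
        return Task.Factory.StartNew(
            action,
            CancellationToken.None,
            TaskCreationOptions.DenyChildAttach,
            TaskScheduler.Default);
    }

    // Reports whether the action was executed by a thread pool thread.
    public static bool RanOnThreadPool()
    {
        bool onPool = false;
        StartOnThreadPool(() => onPool = Thread.CurrentThread.IsThreadPoolThread).Wait();
        return onPool;
    }
}
```

If you only ever start tasks from code that is not itself running under a custom scheduler, the plain StartNew(action) overload will usually end up on the thread pool too; the explicit overload just removes the doubt.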
Parallel.Invoke(Action[]) executes each of the provided actions, possibly in parallel. Parallel.Invoke(ParallelOptions, Action[]) does the same, unless the operation is cancelled by the user.
Task.WhenAll is designed to handle concurrent I/O-bound tasks and scales better for them, because it waits asynchronously without blocking, which leaves threads free to handle other requests. Parallel, on the other hand, is synchronous, so it is better suited to CPU-bound logic.
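As a rough illustration of that split, the sketch below awaits simulated I/O with Task.WhenAll and runs CPU-bound work through Parallel.ForEach. Task.Delay stands in for real I/O, and all the names (WhenAllDemo, SimulatedIoAsync and so on) are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical example class, for illustration only.
public static class WhenAllDemo
{
    // I/O-style work: no thread is blocked while the delays are pending.
    public static async Task<int> RunSimulatedIoAsync(int count)
    {
        var tasks = new Task<int>[count];
        for (int i = 0; i < count; i++)
        {
            tasks[i] = SimulatedIoAsync(i);
        }
        int[] results = await Task.WhenAll(tasks);
        return results.Length;
    }

    // Task.Delay is an assumption standing in for a real network or disk call.
    private static async Task<int> SimulatedIoAsync(int id)
    {
        await Task.Delay(20);
        return id;
    }

    // CPU-style work: the call blocks until every element has been processed.
    public static long SumSquaresInParallel(int[] values)
    {
        long total = 0;
        Parallel.ForEach(values, v => Interlocked.Add(ref total, (long)v * v));
        return total;
    }
}
```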
The most important difference between these two is that Parallel.Invoke will wait for all the actions to complete before continuing with the code, whereas StartNew will move on to the next line of code, allowing the tasks to complete in their own good time.
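Here is a small sketch of that difference. The names (BlockingDemo and its methods) are invented for the example, and Thread.Sleep just stands in for real work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical example class, for illustration only.
public static class BlockingDemo
{
    public static int CompletedAfterInvoke()
    {
        int completed = 0;
        Action work = () => { Thread.Sleep(50); Interlocked.Increment(ref completed); };

        // Blocks here until all three actions have finished.
        Parallel.Invoke(work, work, work);
        return completed;
    }

    public static int CompletedAfterWaitAll()
    {
        int completed = 0;
        Action work = () => { Thread.Sleep(50); Interlocked.Increment(ref completed); };

        // Each StartNew call returns immediately; the work may still be running.
        Task[] tasks =
        {
            Task.Factory.StartNew(work),
            Task.Factory.StartNew(work),
            Task.Factory.StartNew(work),
        };

        // Without this explicit wait, 'completed' could be anything from 0 to 3.
        Task.WaitAll(tasks);
        return Volatile.Read(ref completed);
    }
}
```

If you drop the Task.WaitAll call, the second method becomes fire-and-forget, which is exactly the behaviour Parallel.Invoke cannot give you.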
This semantic difference should be your first (and probably only) consideration. But for informational purposes, here's a benchmark:
/* This is a benchmarking template I use in LINQPad when I want to do a
* quick performance test. Just give it a couple of actions to test and
* it will give you a pretty good idea of how long they take compared
* to one another. It's not perfect: You can expect a 3% error margin
* under ideal circumstances. But if you're not going to improve
* performance by more than 3%, you probably don't care anyway.*/
void Main()
{
    // Enter setup code here
    var actions2 =
        (from i in Enumerable.Range(1, 10000)
         select (Action)(() => {})).ToArray();
    var awaitList = new Task[actions2.Length];
    var actions = new[]
    {
        new TimedAction("Task.Factory.StartNew", () =>
        {
            // Enter code to test here
            int j = 0;
            foreach(var action in actions2)
            {
                awaitList[j++] = Task.Factory.StartNew(action);
            }
            Task.WaitAll(awaitList);
        }),
        new TimedAction("Parallel.Invoke", () =>
        {
            // Enter code to test here
            Parallel.Invoke(actions2);
        }),
    };
    const int TimesToRun = 100; // Tweak this as necessary
    TimeActions(TimesToRun, actions);
}
#region timer helper methods
// Define other methods and classes here
public void TimeActions(int iterations, params TimedAction[] actions)
{
    Stopwatch s = new Stopwatch();
    int length = actions.Length;
    var results = new ActionResult[actions.Length];
    // Perform the actions in their initial order.
    for(int i = 0; i < length; i++)
    {
        var action = actions[i];
        var result = results[i] = new ActionResult{Message = action.Message};
        // Do a dry run to get things ramped up/cached
        result.DryRun1 = s.Time(action.Action, 10);
        result.FullRun1 = s.Time(action.Action, iterations);
    }
    // Perform the actions in reverse order.
    for(int i = length - 1; i >= 0; i--)
    {
        var action = actions[i];
        var result = results[i];
        // Do a dry run to get things ramped up/cached
        result.DryRun2 = s.Time(action.Action, 10);
        result.FullRun2 = s.Time(action.Action, iterations);
    }
    results.Dump();
}
public class ActionResult
{
    public string Message {get;set;}
    public double DryRun1 {get;set;}
    public double DryRun2 {get;set;}
    public double FullRun1 {get;set;}
    public double FullRun2 {get;set;}
}
public class TimedAction
{
    public TimedAction(string message, Action action)
    {
        Message = message;
        Action = action;
    }
    public string Message {get;private set;}
    public Action Action {get;private set;}
}
public static class StopwatchExtensions
{
    public static double Time(this Stopwatch sw, Action action, int iterations)
    {
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }
}
#endregion
Results:
Message | DryRun1 | DryRun2 | FullRun1 | FullRun2
----------------------------------------------------------------
Task.Factory.StartNew | 43.0592 | 50.847 | 452.2637 | 463.2310
Parallel.Invoke | 10.5717 | 9.948 | 102.7767 | 101.1158
As you can see, using Parallel.Invoke can be roughly 4.5x faster than waiting for a bunch of newed-up tasks to complete. Of course, that's when your actions do absolutely nothing. The more each action does, the less of a difference you'll notice.
In the grand scheme of things, the performance difference between the two methods is negligible compared with the overhead of actually dealing with lots of tasks in any case.
Parallel.Invoke basically performs the Task.Factory.StartNew() for you. So, I'd say readability is more important here.
Also, as StriplingWarrior mentions, Parallel.Invoke performs a WaitAll (blocking the code until all the tasks are completed) for you, so you don't have to do that either. If you want to have the tasks run in the background without caring when they complete, then you want Task.Factory.StartNew().