
ActionBlock<T> vs Task.WhenAll

I would like to know what is the recommended way to execute multiple async methods in parallel?

In System.Threading.Tasks.Dataflow we can specify the max degree of parallelism, but unbounded is probably the default for Task.WhenAll too?

This:

var tasks = new List<Task>();
foreach(var item in items)
{
    tasks.Add(myAsyncMethod(item));
}
await Task.WhenAll(tasks.ToArray());

Or that:

var action = new ActionBlock<string>(myAsyncMethod, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded,
            BoundedCapacity = DataflowBlockOptions.Unbounded,
            MaxMessagesPerTask = DataflowBlockOptions.Unbounded
        });
foreach (var item in items)
{
    action.Post(item);
}
action.Complete();

await action.Completion;
asked May 16 '16 08:05 by fred_


2 Answers

I would like to know what is the recommended way to execute multiple async methods in parallel?

Side note: actually not parallel, but concurrent.

In System.Threading.Tasks.Dataflow we can specify the max degree of parallelism, but unbounded is probably the default for Task.WhenAll too?

As someone commented, Task.WhenAll only joins existing tasks; by the time your code gets to Task.WhenAll, all the concurrency decisions have already been made.
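
To illustrate that point, here is a minimal sketch. WhenAllDemo and WorkAsync are hypothetical names standing in for the code in the question, and Task.Delay simulates I/O. It records how many operations have already started before Task.WhenAll is ever called:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Returns how many operations had started before Task.WhenAll was called,
    // and how many had started once it completed.
    public static async Task<(int StartedBeforeWhenAll, int StartedAfter)> RunAsync(int count)
    {
        int started = 0;

        // Hypothetical stand-in for myAsyncMethod.
        async Task WorkAsync()
        {
            started++;             // runs synchronously on the caller's thread, before the first await
            await Task.Delay(50);  // simulated I/O
        }

        var tasks = new List<Task>();
        for (int i = 0; i < count; i++)
        {
            tasks.Add(WorkAsync()); // the call itself starts the operation
        }

        int startedBeforeWhenAll = started; // every operation is already in flight here
        await Task.WhenAll(tasks);          // only waits for the tasks; it starts nothing
        return (startedBeforeWhenAll, started);
    }
}
```

Both numbers come out equal to count, because calling an async method runs it up to its first incomplete await; Task.WhenAll has no say in how many operations run at once.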

You can throttle plain asynchronous code by using something like SemaphoreSlim.
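
A rough sketch of that SemaphoreSlim approach follows. The helper name ForEachThrottledAsync and its maxConcurrency parameter are made up for illustration; only SemaphoreSlim, WaitAsync, Release and Task.WhenAll are real APIs. Each wrapper task waits for a slot before invoking the work and frees the slot afterwards:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttling
{
    // Runs `work` for every item, allowing at most `maxConcurrency`
    // operations to be in flight at the same time.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> work)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        var tasks = items.Select(async item =>
        {
            await gate.WaitAsync();   // wait for a free slot
            try
            {
                await work(item);
            }
            finally
            {
                gate.Release();       // free the slot even if work throws
            }
        }).ToList();                  // ToList forces the query, creating all wrapper tasks now

        await Task.WhenAll(tasks);    // joins the wrappers; the semaphore did the throttling
    }
}
```

With the question's code this could be called as `await Throttling.ForEachThrottledAsync(items, 4, myAsyncMethod);`, where 4 is an arbitrary limit.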

The decision of whether to use asynchronous concurrency directly or TPL Dataflow is dependent on the surrounding code. If this concurrent operation is just called once asynchronously, then asynchronous concurrency is the best bet; but if this concurrent operation is part of a "pipeline" for your data, then TPL Dataflow may be a better fit.
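
For the "pipeline" case, a small sketch of what that might look like is below. The stages are invented for illustration (the first computes a string's length after a simulated delay, the second collects results), and the limit of 4 is arbitrary. Depending on your target framework you may need to reference the System.Threading.Tasks.Dataflow package.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class PipelineSketch
{
    public static async Task<List<int>> RunAsync(IEnumerable<string> items)
    {
        var results = new List<int>();

        // Stage 1 (hypothetical): an async transform, up to 4 items at a time.
        var measure = new TransformBlock<string, int>(async item =>
        {
            await Task.Delay(10); // simulated I/O
            return item.Length;
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

        // Stage 2: consumes results one at a time (the default MaxDegreeOfParallelism is 1),
        // so the plain List<int> is never accessed concurrently.
        var collect = new ActionBlock<int>(length => results.Add(length));

        // Link the stages and let completion flow from stage 1 to stage 2.
        measure.LinkTo(collect, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var item in items)
        {
            await measure.SendAsync(item);
        }
        measure.Complete();

        await collect.Completion; // completes once every item has passed through both stages
        return results;
    }
}
```

Each stage carries its own throttling and buffering options, which is the part you would otherwise have to write by hand around Task.WhenAll.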

answered Oct 08 '22 20:10 by Stephen Cleary


Both methods are acceptable, and the choice should be governed by your requirements. As you can see, Dataflow gives you a lot of configurability that you would otherwise have to implement manually when using tasks directly.

Note that in both situations the thread pool will be responsible for enqueuing and running the tasks, so the behaviour should remain the same.

Dataflow is good at chaining together groups of composable asynchronous operations, whereas using tasks directly gives you finer-grained control.

answered Oct 08 '22 20:10 by Slugart