
Awaiting lots of tasks

I have a set of Task objects (lots of them, around 400):

IEnumerable<Task> tasks = ...

I want to run them all at the same time and then wait for each one of them. I use this piece of code to run the tasks:

Task.Run(async () => { ... });

Each of the tasks runs asynchronous methods itself, which is why I need the async keyword in the lambda. Among these nested tasks there are, notably, HTTP requests being sent and HTTP responses being received.

I tried two different ways to wait for all the tasks to complete:

await Task.WhenAll(tasks);

and

foreach (var task in tasks)
{
    await task;
}

A priori, these look exactly the same to me (but of course they are not, otherwise I would not be posting here in the first place...).

The first way makes the tasks run faster, but the output window fills up with messages such as "A first chance exception of type 'System.Net.Sockets.SocketException' occurred in System.dll" and others like it. Moreover, some tasks are still in the WaitingForActivation state after the call to await Task.WhenAll().

The second way is slower, and it looks like the tasks are not running simultaneously: I receive the HTTP responses one by one, whereas with the first way they arrive almost all at the same time. Also, I see no first chance exceptions at all in the output window when I use the foreach loop to wait for each task, and no task is in the WaitingForActivation state after the loop.

I understand the "best" way to wait for a set of tasks is to use WhenAll() (at least for readability), but why do these two methods behave differently? How can I overcome this issue? Ideally I want the tasks to run fast and to be sure that everything has ended (I have a try/catch/finally block in the lambda to handle server errors, and I did not forget the if(httpClient != null) httpClient.Dispose() in the finally, before anyone asks...).

Any hints are welcome!

EDIT:

Okay I tried another thing. I added:

.ContinueWith(x => System.Diagnostics.Debug.WriteLine("#### ENDED = " + index)));

to each task, index being the number of the Task. When using the foreach loop, I get:

#### ENDED = 0
#### ENDED = 1
#### ENDED = 2
#### ENDED = 3
#### ENDED = 4
...

When using WhenAll(), I get:

#### ENDED = 1
#### ENDED = 3
#### ENDED = 0
#### ENDED = 4
#### ENDED = 8
...

So using the foreach loop makes all my tasks run one after another... which may explain why I don't get any first chance exceptions in the output window, since the system is not stressed by the algorithm at all.

EDIT2:

Sample code : http://pastebin.com/5bMWicD4

It uses a public service available here : http://timezonedb.com/

asked Jan 12 '23 23:01 by Max


2 Answers

The two attempts are totally different.

The first attempt awaits all the tasks and continues only afterwards. It will throw only after all the tasks have completed. The tasks complete in no particular order; it depends on which one finishes first.

The second waits for each task one by one, in the order they appear in the tasks collection, which of course is not what you want and is rather slow. It stops waiting with an exception as soon as it reaches a task that failed. The remaining tasks are then never awaited, so their results and exceptions go unobserved.

It's not EXACTLY like running the tasks synchronously, since some tasks will finish earlier than others, but you still have to check them all one at a time.

You should note here that Task.WhenAll doesn't block by itself. It returns a Task that finishes when all other tasks have finished. By calling await Task.WhenAll you await on that task's completion. You could check that task's status to see whether one or more subtasks failed or were cancelled, or process the results with a call to ContinueWith.

You could also call Task.WaitAll instead of await Task.WhenAll to block the calling thread until all the tasks have finished; it then throws if any of them faulted or was cancelled. This is somewhat similar to your second attempt, although it still avoids waiting on all tasks one by one.

The fact that you get a lot of exceptions has nothing to do with the way you await. There are limits on how many HTTP connections you can make to the same domain (i.e. address) at a time, and there may be timeout errors (usually caused by the connection limit) or other network-related problems.
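One common way to stay under such a limit is to cap how many requests are in flight at once. The following is only a minimal sketch of that idea using SemaphoreSlim; the class name, the method name RunThrottledAsync, the limit of 10 and the Task.Delay standing in for the HTTP call are all assumptions made for illustration, not part of the question's code.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Runs `count` simulated requests with at most `maxConcurrency` in flight,
    // and returns the highest number that were observed running at once.
    public static async Task<int> RunThrottledAsync(int count, int maxConcurrency)
    {
        var gate = new SemaphoreSlim(maxConcurrency);
        var sync = new object();
        int running = 0, peak = 0;

        var tasks = Enumerable.Range(0, count).Select(async i =>
        {
            await gate.WaitAsync(); // wait for a free slot
            try
            {
                lock (sync) { running++; peak = Math.Max(peak, running); }
                await Task.Delay(20); // placeholder for the real HTTP request (assumption)
            }
            finally
            {
                lock (sync) { running--; }
                gate.Release(); // free the slot for the next request
            }
        }).ToList(); // materialize so that every task is started immediately

        await Task.WhenAll(tasks);
        return peak;
    }

    public static async Task Main()
    {
        int peak = await RunThrottledAsync(400, 10);
        Console.WriteLine("Peak concurrent requests: " + peak);
    }
}
```

All 400 tasks are still created up front and awaited with Task.WhenAll; the semaphore only limits how many of them are past the gate at any moment, so the reported peak never exceeds the chosen limit.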

The kind of exception you receive, though, is affected by whether you call await Task.WhenAll or Task.WaitAll. This post explains the issue, but in short, Task.WaitAll collects all the exceptions and throws an AggregateException, while await Task.WhenAll rethrows only one of them.
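To make the difference concrete, here is a small sketch with two tasks that are made to fail on purpose. The helper FailAfterAsync and the exception types are hypothetical stand-ins for the failing HTTP calls. It also shows that the Task returned by Task.WhenAll still carries every exception in its Exception property, even though await surfaces only one.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionDemo
{
    // Hypothetical helper: a task that fails after a short delay.
    private static async Task FailAfterAsync(int delayMs, Exception error)
    {
        await Task.Delay(delayMs);
        throw error;
    }

    // await rethrows a single exception, but the WhenAll task keeps all of them.
    public static async Task<int> CountViaWhenAllAsync()
    {
        Task all = Task.WhenAll(
            FailAfterAsync(10, new InvalidOperationException("first")),
            FailAfterAsync(20, new TimeoutException("second")));
        try
        {
            await all;
        }
        catch (Exception ex)
        {
            Console.WriteLine("await surfaced: " + ex.GetType().Name);
        }
        return all.Exception.InnerExceptions.Count;
    }

    // WaitAll blocks, then throws an AggregateException wrapping every failure.
    public static int CountViaWaitAll()
    {
        try
        {
            Task.WaitAll(
                FailAfterAsync(10, new InvalidOperationException("first")),
                FailAfterAsync(20, new TimeoutException("second")));
        }
        catch (AggregateException ex)
        {
            return ex.InnerExceptions.Count;
        }
        return 0;
    }

    public static async Task Main()
    {
        Console.WriteLine("WhenAll task holds: " + await CountViaWhenAllAsync());
        Console.WriteLine("WaitAll threw:      " + CountViaWaitAll());
    }
}
```

In both cases two exceptions are recorded; the only difference is how many of them reach the catch block directly.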

By the way, what is the message you receive for the SocketException?

answered Jan 20 '23 15:01 by Panagiotis Kanavos


The behavior of your code has nothing to do with await. It is caused by the way you iterate the collection of Tasks. Most LINQ methods are lazy, which means they actually execute their code only when you iterate them.

So, this code starts each Task only after the previous one has completed:

foreach (var task in tasks)
{
    await task;
}

but this code starts all of them at once:

foreach (var task in tasks.ToList())
{
    await task;
}

And since Task.WhenAll() does the equivalent of ToList() internally, you're going to get the same behavior as the second snippet above.
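The effect can be reproduced in isolation. The sketch below assumes, as this answer does, that the tasks come from a lazy LINQ Select over Task.Run; the class name, the counter and the Task.Delay standing in for the HTTP call are made up for the demonstration. It reports how many tasks had been started by the time the first one was awaited to completion.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LazyTasksDemo
{
    private static int started;

    // Hypothetical stand-in for the question's lambda: counts started tasks.
    private static IEnumerable<Task> CreateTasks(int count)
    {
        return Enumerable.Range(0, count).Select(i => Task.Run(async () =>
        {
            Interlocked.Increment(ref started);
            await Task.Delay(100); // simulated HTTP call
        }));
    }

    // Returns how many tasks had started once the first one had been awaited.
    public static async Task<int> StartedAfterFirstAwaitAsync(bool materialize)
    {
        started = 0;
        IEnumerable<Task> tasks = CreateTasks(5);
        if (materialize)
        {
            tasks = tasks.ToList(); // forces every Task.Run to execute now
        }

        int observed = -1;
        foreach (var task in tasks)
        {
            await task;
            if (observed < 0)
            {
                observed = Volatile.Read(ref started);
            }
        }
        return observed;
    }

    public static async Task Main()
    {
        Console.WriteLine("lazy:         " + await StartedAfterFirstAwaitAsync(false));
        Console.WriteLine("materialized: " + await StartedAfterFirstAwaitAsync(true));
    }
}
```

With the lazy sequence only one task exists when the first await finishes, because the Select lambda runs once per loop iteration. With ToList() all five are created before the loop begins, so the second number is normally higher, although the exact value depends on thread pool scheduling.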

answered Jan 20 '23 13:01 by svick