Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Async await in linq select

People also ask

Are LINQ queries async?

Scalar value returning methods The difference with the IEnumerable returning methods is that the method which defines what value to return is also the async method. This means that the async method is part of the linq query.

What is an async task C#?

An async method runs synchronously until it reaches its first await expression, at which point the method is suspended until the awaited task is complete. In the meantime, control returns to the caller of the method, as the example in the next section shows.


var inputs = events.Select(async ev => await ProcessEventAsync(ev))
                   .Select(t => t.Result)
                   .Where(i => i != null)
                   .ToList();

But this seems very weird to me, first of all the use of async and await in the select. According to this answer by Stephen Cleary I should be able to drop those.

The call to Select is valid. These two lines are essentially identical:

events.Select(async ev => await ProcessEventAsync(ev))
events.Select(ev => ProcessEventAsync(ev))

(There's a minor difference regarding how a synchronous exception would be thrown from ProcessEventAsync, but in the context of this code it doesn't matter at all.)

Then the second Select which selects the result. Doesn't this mean the task isn't async at all and is performed synchronously (so much effort for nothing), or will the task be performed asynchronously and when it's done the rest of the query is executed?

It means that the query is blocking. So it is not really asynchronous.

Breaking it down:

var inputs = events.Select(async ev => await ProcessEventAsync(ev))

will first start an asynchronous operation for each event. Then this line:

                   .Select(t => t.Result)

will wait for those operations to complete one at a time (first it waits for the first event's operation, then the next, then the next, etc).

This is the part I don't care for, because it blocks and also would wrap any exceptions in AggregateException.

and is it completely the same like this?

var tasks = await Task.WhenAll(events.Select(ev => ProcessEventAsync(ev)));
var inputs = tasks.Where(result => result != null).ToList();

var inputs = (await Task.WhenAll(events.Select(ev => ProcessEventAsync(ev))))
                                       .Where(result => result != null).ToList();

Yes, those two examples are equivalent. They both start all asynchronous operations (events.Select(...)), then asynchronously wait for all the operations to complete in any order (await Task.WhenAll(...)), then proceed with the rest of the work (Where...).

Both of these examples are different from the original code. The original code is blocking and will wrap exceptions in AggregateException.


Existing code is working, but is blocking the thread.

.Select(async ev => await ProcessEventAsync(ev))

creates a new Task for every event, but

.Select(t => t.Result)

blocks the thread waiting for each new task to end.

In the other hand your code produce the same result but keeps asynchronous.

Just one comment on your first code. This line

var tasks = await Task.WhenAll(events...

will produce a single Task<TResult[]> so the variable should be named in singular.

Finally your last code make the same but is more succinct.

For reference: Task.Wait / Task.WhenAll


I used this code:

public static async Task<IEnumerable<TResult>> SelectAsync<TSource,TResult>(
    this IEnumerable<TSource> source, Func<TSource, Task<TResult>> method)
{
  return await Task.WhenAll(source.Select(async s => await method(s)));
}

like this:

var result = await sourceEnumerable.SelectAsync(async s=>await someFunction(s,other params));

Edit:

Some people have raised the issue of concurrency, like when you are accessing a database and you can't run two tasks at the same time. So here is a more complex version that also allows for a specific concurrency level:

public static async Task<IEnumerable<TResult>> SelectAsync<TSource, TResult>(
    this IEnumerable<TSource> source, Func<TSource, Task<TResult>> method,
    int concurrency = int.MaxValue)
{
    var semaphore = new SemaphoreSlim(concurrency);
    try
    {
        return await Task.WhenAll(source.Select(async s =>
        {
            try
            {
                await semaphore.WaitAsync();
                return await method(s);
            }
            finally
            {
                semaphore.Release();
            }
        }));
    } finally
    {
        semaphore.Dispose();
    }
}

Without a parameter it behaves exactly as the simpler version above. With a parameter of 1 it will execute all tasks sequentially:

var result = await sourceEnumerable.SelectAsync(async s=>await someFunction(s,other params),1);

Note: Executing the tasks sequentially doesn't mean the execution will stop on error!

Just like with a larger value for concurrency or no parameter specified, all the tasks will be executed and if any of them fail, the resulting AggregateException will contain the thrown exceptions.

If you want to execute tasks one after the other and fail at the first one, try another solution, like the one suggested by xhafan (https://stackoverflow.com/a/64363463/379279)


I prefer this as an extension method:

public static async Task<IEnumerable<T>> WhenAll<T>(this IEnumerable<Task<T>> tasks)
{
    return await Task.WhenAll(tasks);
}

So that it is usable with method chaining:

var inputs = await events
  .Select(async ev => await ProcessEventAsync(ev))
  .WhenAll()

With current methods available in Linq it looks quite ugly:

var tasks = items.Select(
    async item => new
    {
        Item = item,
        IsValid = await IsValid(item)
    });
var tuples = await Task.WhenAll(tasks);
var validItems = tuples
    .Where(p => p.IsValid)
    .Select(p => p.Item)
    .ToList();

Hopefully following versions of .NET will come up with more elegant tooling to handle collections of tasks and tasks of collections.


I have the same problem as @KTCheek in that I need it to execute sequentially. However I figured I would try using IAsyncEnumerable (introduced in .NET Core 3) and await foreach (introduced in C# 8). Here's what I have come up with:

public static class IEnumerableExtensions {
    public static async IAsyncEnumerable<TResult> SelectAsync<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, Task<TResult>> selector) {
        foreach (var item in source) {
            yield return await selector(item);
        }
    }
}

public static class IAsyncEnumerableExtensions {
    public static async Task<List<TSource>> ToListAsync<TSource>(this IAsyncEnumerable<TSource> source) {
        var list = new List<TSource>();

        await foreach (var item in source) {
            list.Add(item);
        }

        return list;
    }
}

This can be consumed by saying:

var inputs = await events.SelectAsync(ev => ProcessEventAsync(ev)).ToListAsync();

Update: Alternatively you can add a reference to System.Linq.Async and then you can say:

var inputs = await events
    .ToAsyncEnumerable()
    .SelectAwait(async ev => await ProcessEventAsync(ev))
    .ToListAsync();