
How to use C#8 IAsyncEnumerable<T> to async-enumerate tasks run in parallel

If possible, I want to create an async enumerator for tasks launched in parallel, so that the first task to complete yields the first element of the enumeration, the second to finish yields the second element, and so on.

public static async IAsyncEnumerable<T> ParallelEnumerateAsync<T>(this IEnumerable<Task<T>> coldAsyncTasks)
{
    // ... 
}

I bet there is a way using ContinueWith and a Queue<T>, but I don't completely trust myself to implement it.

i cant codez asked Jun 09 '19 20:06


3 Answers

Is this what you're looking for?

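public static async IAsyncEnumerable<T> ParallelEnumerateAsync<T>(
    this IEnumerable<Task<T>> tasks)
{
    // Copying into a list enumerates the sequence, so lazily created
    // tasks are all started here.
    var remaining = new List<Task<T>>(tasks);

    while (remaining.Count != 0)
    {
        // WhenAny completes as soon as any remaining task finishes.
        var task = await Task.WhenAny(remaining);
        remaining.Remove(task);

        // The task is already complete; awaiting it unwraps the result
        // or rethrows its exception.
        yield return await task;
    }
}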
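
A minimal usage sketch for the method above, assuming three test tasks with fixed delays. `DelayedAsync`, `CompletionOrderExtensions`, `CompletionOrderSketch`, and `CollectAsync` are hypothetical names introduced only for this illustration; the extension method is repeated so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class CompletionOrderExtensions
{
    // Same method as in the answer, repeated to keep the sketch self-contained.
    public static async IAsyncEnumerable<T> ParallelEnumerateAsync<T>(
        this IEnumerable<Task<T>> tasks)
    {
        var remaining = new List<Task<T>>(tasks);

        while (remaining.Count != 0)
        {
            var task = await Task.WhenAny(remaining);
            remaining.Remove(task);
            yield return await task;
        }
    }
}

public static class CompletionOrderSketch
{
    // Hypothetical helper: completes with `label` after `delayMs` milliseconds.
    static async Task<string> DelayedAsync(string label, int delayMs)
    {
        await Task.Delay(delayMs);
        return label;
    }

    // Collects the labels in the order the stream yields them.
    public static async Task<List<string>> CollectAsync()
    {
        // All three tasks are already running once the array is built.
        var tasks = new[]
        {
            DelayedAsync("slow", 600),
            DelayedAsync("fast", 100),
            DelayedAsync("medium", 300),
        };

        var results = new List<string>();
        await foreach (var label in tasks.ParallelEnumerateAsync())
            results.Add(label);
        return results;
    }

    public static async Task Main()
    {
        Console.WriteLine(string.Join(", ", await CollectAsync()));
    }
}
```

With these delays the program should print `fast, medium, slow`, that is, completion order rather than the order in which the tasks were created. Note that `Task.WhenAny` and `List<T>.Remove` each scan the remaining tasks, so the loop does roughly quadratic work overall; that is usually fine for a handful of tasks.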

Paulo Morgado answered Oct 23 '22 08:10


If I understand your question right, your focus is to launch all tasks, let them all run in parallel, but make sure the return values are processed in the same order as the tasks were launched.

Going by the specs, with C# 8.0 asynchronous streams, queuing tasks for parallel execution but sequential return can look like this:

/// Demonstrates Parallel Execution - Sequential Results with test tasks
async Task RunAsyncStreams()
{
    await foreach (var n in RunAndPreserveOrderAsync(GenerateTasks(6)))
    {
        Console.WriteLine($"#{n} is returned");
    }
}

/// Returns an enumerator that will produce a number of test tasks running
/// for a random time.
IEnumerable<Task<int>> GenerateTasks(int count)
{
    return Enumerable.Range(1, count).Select(async n =>
    {
        await Task.Delay(new Random().Next(100, 1000));
        Console.WriteLine($"#{n} is complete");
        return n;
    });
}

/// Launches all tasks in order of enumeration, then waits for the results
/// in the same order: Parallel Execution - Sequential Results.
async IAsyncEnumerable<T> RunAndPreserveOrderAsync<T>(IEnumerable<Task<T>> tasks)
{
    var queue = new Queue<Task<T>>(tasks);
    while (queue.Count > 0) yield return await queue.Dequeue();
}

Possible output:

#5 is complete
#1 is complete
#1 is returned
#3 is complete
#6 is complete
#2 is complete
#2 is returned
#3 is returned
#4 is complete
#4 is returned
#5 is returned
#6 is returned

On a practical note, there doesn't seem to be any new language-level support for this pattern. Besides, since asynchronous streams deal with IAsyncEnumerable<T>, a non-generic Task would not work here, and all the worker async methods must have the same Task<T> return type, which somewhat limits asynchronous streams-based design.

Because of this, and depending on your situation (Do you want to be able to cancel long-running tasks? Is per-task exception handling required? Should there be a limit to the number of concurrent tasks?), it might make sense to check out @TheGeneral's suggestions up there.
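
As one illustration of the cancellation question, here is a hedged sketch of the order-preserving method with a `CancellationToken` added. It assumes .NET 6 or later for `Task<T>.WaitAsync(CancellationToken)`; `CancellableOrderSketch`, `NumberAfterAsync`, and `ConsumeAsync` are hypothetical names used only here. Cancelling only stops the enumeration from waiting; the underlying tasks keep running unless they observe a token themselves.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class CancellableOrderSketch
{
    // [EnumeratorCancellation] lets a token passed via WithCancellation(...)
    // flow into this parameter.
    public static async IAsyncEnumerable<T> RunAndPreserveOrderAsync<T>(
        IEnumerable<Task<T>> tasks,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var queue = new Queue<Task<T>>(tasks);
        while (queue.Count > 0)
        {
            // Assumption: .NET 6+. WaitAsync abandons the wait on cancellation;
            // it does not stop the task that is being waited on.
            yield return await queue.Dequeue().WaitAsync(cancellationToken);
        }
    }

    // Hypothetical test task: returns n after delayMs milliseconds.
    static async Task<int> NumberAfterAsync(int n, int delayMs)
    {
        await Task.Delay(delayMs);
        return n;
    }

    public static async Task<(List<int> Received, bool Cancelled)> ConsumeAsync()
    {
        var tasks = new[] { NumberAfterAsync(1, 100), NumberAfterAsync(2, 5000) };
        var received = new List<int>();

        // Give up after one second; the second task needs five.
        using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(1000));
        try
        {
            await foreach (var n in RunAndPreserveOrderAsync(tasks).WithCancellation(cts.Token))
                received.Add(n);
        }
        catch (OperationCanceledException)
        {
            return (received, true);
        }
        return (received, false);
    }

    public static async Task Main()
    {
        var (received, cancelled) = await ConsumeAsync();
        Console.WriteLine($"received: {string.Join(", ", received)}; cancelled: {cancelled}");
    }
}
```

With these delays the first result arrives, and the wait for the second is abandoned when the token fires, so the consumer sees an `OperationCanceledException` after receiving `1`.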

Update:

Note that RunAndPreserveOrderAsync<T> does not necessarily have to use a Queue of tasks; this was only chosen to make the intent of the code clearer.

var queue = new Queue<Task<T>>(tasks);
while (queue.Count > 0) yield return await queue.Dequeue();

Converting the enumerable to a List would produce the same result; the body of RunAndPreserveOrderAsync<T> can be replaced with this one line:

foreach(var task in tasks.ToList()) yield return await task;

In this implementation it is important that all the tasks are generated and launched first, which happens when the Queue is initialized or when the tasks enumerable is converted to a List. However, it might be hard to resist simplifying the above foreach line like this:

foreach(var task in tasks) yield return await task;

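which would cause the tasks to be executed sequentially rather than in parallel: the tasks enumerable is lazy (as with the Select in GenerateTasks), so each task is only created and started when the loop reaches it, after the previous one has already been awaited.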
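
The difference can be made visible with a rough timing sketch. It assumes a lazy `Select`-based source like GenerateTasks above, but with a fixed delay; `LazyVersusMaterializedSketch`, `Materialized`, `Lazy`, and `TimeAsync` are hypothetical names for this illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class LazyVersusMaterializedSketch
{
    // Lazy source: each task is created (and started) only when enumerated.
    public static IEnumerable<Task<int>> GenerateTasks(int count, int delayMs) =>
        Enumerable.Range(1, count).Select(async n =>
        {
            await Task.Delay(delayMs);
            return n;
        });

    // ToList() enumerates the whole source first, so all tasks start together.
    public static async IAsyncEnumerable<T> Materialized<T>(IEnumerable<Task<T>> tasks)
    {
        foreach (var task in tasks.ToList()) yield return await task;
    }

    // Without ToList(), the next task is not created until the previous
    // one has been awaited.
    public static async IAsyncEnumerable<T> Lazy<T>(IEnumerable<Task<T>> tasks)
    {
        foreach (var task in tasks) yield return await task;
    }

    // Drains the stream and reports how long that took.
    public static async Task<TimeSpan> TimeAsync(IAsyncEnumerable<int> stream)
    {
        var stopwatch = Stopwatch.StartNew();
        await foreach (var item in stream) { }
        return stopwatch.Elapsed;
    }

    public static async Task Main()
    {
        var materialized = await TimeAsync(Materialized(GenerateTasks(4, 250)));
        var lazy = await TimeAsync(Lazy(GenerateTasks(4, 250)));

        Console.WriteLine($"with ToList():    {materialized.TotalMilliseconds:F0} ms");
        Console.WriteLine($"without ToList(): {lazy.TotalMilliseconds:F0} ms");
    }
}
```

With four tasks of 250 ms each, the materialized version should take roughly as long as one delay, while the lazy version should take roughly the sum of all four.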

DK. answered Oct 23 '22 07:10


My take on this task. It borrows heavily from the other answers in this topic, but with (hopefully) some enhancements. The idea is to start the tasks and put them in a queue, same as in the other answers, but like Theodor Zoulias, I'm also trying to limit the maximum degree of parallelism. However, I tried to overcome the limitation he mentioned in his comment by using a task continuation to queue the next task as soon as any of the previous tasks completes. This way we maximize the number of simultaneously running tasks, within the configured limit of course.

I'm not an async expert. This solution might have multithreading deadlocks and other Heisenbugs, and I did not test exception handling etc., so you've been warned.

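public static async IAsyncEnumerable<TResult> ExecuteParallelAsync<TResult>(IEnumerable<Task<TResult>> coldTasks, int degreeOfParallelism)
{
    if (degreeOfParallelism < 1)
        throw new ArgumentOutOfRangeException(nameof(degreeOfParallelism));

    // A materialized collection means the tasks were already created (and started),
    // so their parallelism could no longer be limited here.
    if (coldTasks is ICollection<Task<TResult>>) throw new ArgumentException(
        "The enumerable should not be materialized.", nameof(coldTasks));

    var queue = new ConcurrentQueue<Task<TResult>>();

    using var enumerator = coldTasks.GetEnumerator();

    // Start the first batch, up to the configured degree of parallelism.
    for (var index = 0; index < degreeOfParallelism && EnqueueNextTask(); index++) ;

    // Each queued continuation enqueues its successor before it completes,
    // so the queue only runs dry once the source is exhausted.
    while (queue.TryDequeue(out var nextTask)) yield return await nextTask;

    bool EnqueueNextTask()
    {
        // Continuations may run concurrently on thread-pool threads.
        lock (enumerator)
        {
            if (!enumerator.MoveNext()) return false;

            var nextTask = enumerator.Current
                .ContinueWith(t =>
                {
                    EnqueueNextTask();
                    // Unlike t.Result, this rethrows the original exception
                    // instead of wrapping it in an AggregateException.
                    return t.GetAwaiter().GetResult();
                });
            queue.Enqueue(nextTask);
            return true;
        }
    }
}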

We use this method to generate testing tasks (borrowed from DK's answer):

IEnumerable<Task<int>> GenerateTasks(int count)
{
    return Enumerable.Range(1, count).Select(async n =>
    {
        Console.WriteLine($"#{n} started");
        await Task.Delay(new Random().Next(100, 1000));
        Console.WriteLine($"#{n} is complete");
        return n;
    });
}

And also his (or her) test runner:

async Task Main()
{
    await foreach (var n in ExecuteParallelAsync(GenerateTasks(9), 3))
    {
        Console.WriteLine($"#{n} is returned");
    }
}

And we get this result in LINQPad (which is awesome, BTW):

#1 started
#2 started
#3 started
#3 is complete
#4 started
#2 is complete
#5 started
#1 is complete
#6 started
#1 is returned
#2 is returned
#3 is returned
#4 is complete
#7 started
#4 is returned
#6 is complete
#8 started
#7 is complete
#9 started
#8 is complete
#5 is complete
#5 is returned
#6 is returned
#7 is returned
#8 is returned
#9 is complete
#9 is returned

Note how the next task starts as soon as any of the previous tasks completes, and how the order in which they return is still preserved.

Zar Shardan answered Oct 23 '22 08:10