
What is the purpose of the *Async methods in .Net Framework given the ability to run any method asynchronously using Task.Run?

Short question:

Why did .Net Framework add a lot of *Async versions of methods instead of letting developers just use Task.Run to run synchronous methods asynchronously?

Detailed question:

  • I understand the concept of asynchrony.
  • I know about Tasks.
  • I know about the async/await keywords.
  • I know what the *Async methods in .Net Framework do.

What I don't understand is the purpose of the *Async methods in the library.

Suppose that you have two lines of code:

F1();
F2();

With respect to the data/control flow there are only two cases:

  • F2 needs to be executed after F1 finishes.
  • F2 does not need to wait for F1 to finish.

I don't see any other cases. I don't see any general need to know the concrete thread that executes some function (apart from UI). The base execution mode of code in a thread is synchronous. Parallelism requires multiple threads. Asynchrony is based on parallelism and code reordering. But the base is still synchronous.

The difference does not matter when F1's workload is small. But when F1 takes a lot of time to finish, we may need to look at the situation and, if F2 does not need to wait for F1 to finish, we can run F1 in parallel with F2.

A long time ago we did that using threads and thread pools. Now we have Tasks.

If we want to run F1 and F2 in parallel, we can write:

var task1 = Task.Run(F1);
F2();

Tasks are cool, and we can use await in places where we finally need the task to be finished.
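The pattern above can be sketched as a complete program. The bodies of F1 and F2 are hypothetical stand-ins (a short sleep simulates the heavy workload), and ParallelSketch and RunAsync are names made up for this illustration:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ParallelSketch
{
    // Hypothetical stand-ins for F1 and F2 from the text.
    public static void F1() { Thread.Sleep(200); Console.WriteLine("F1 finished"); }
    public static void F2() { Console.WriteLine("F2 finished"); }

    public static async Task RunAsync()
    {
        // F1 is queued to the thread pool; a pool thread stays busy until it returns.
        Task task1 = Task.Run(() => F1());
        // F2 runs on the calling thread in the meantime.
        F2();
        // Resume here only once F1 has finished.
        await task1;
    }

    static void Main()
    {
        // Block in Main only because a console program needs a synchronous entry point.
        RunAsync().GetAwaiter().GetResult();
    }
}
```

Note that Task.Run moves the synchronous work to another thread; that thread is occupied for the whole duration of F1.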

So far, I don't see any need to make an F1Async() method.

Now, let's look at some special cases. The only real special case I see is UI. The UI thread is special, and stalling it makes the UI freeze, which is bad. As I see it, Microsoft advises us to mark the UI event handlers async. Marking the methods async means that we can use the await keyword to basically schedule the heavy processing on another thread and free the UI thread until the processing is finished.

What I don't get, again, is why we need any *Async methods to be able to await them. We can always just write await Task.Run(F1);. Why would we need F1Async?

You may say that the *Async methods use some special magic (like handling external signals) that makes them more efficient than their synchronous counterparts. The thing is that I don't see this being the case.

Let's look at the Stream.ReadAsync for example. If you look at the source code, ReadAsync just wastes several hundred lines of bells and whistles code to create a task that just calls the synchronous Read method. Why do we need it then? Why not just use Task.Run with Stream.Read?

This is why I don't understand the need to bloat the libraries by creating the trivial *Async copies of synchronous methods. MS could have even added the syntactic sugar, so that we could write await async Stream.Read instead of await Stream.ReadAsync or Task.Run(Stream.Read).

Now you may ask "Why not make the *Async methods the only ones and remove the synchronous methods?". As I've said earlier, the base code execution mode is synchronous. It's easy to run a synchronous method asynchronously, but not the other way around.

So, what is the purpose of the *Async methods in .Net Framework given the ability to run any method asynchronously using Task.Run?

P.S. If not freezing the UI is so important, why not just run the handlers async by default and prevent any chance of freezing?

The "no threads" argument:

People answering this question seem to imply that the advantage of *Async methods is that they are efficient because they don't create new threads. The problem is that I don't see such behavior. The parallel asynchronous tasks behave just like I thought - a thread is created (or taken from the thread pool) for each parallel task (not all tasks are executed in parallel though).

Here is my test code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApplication32167 {
    class Program {
        static async Task TestAsync() {
            var httpClient = new HttpClient() { Timeout = TimeSpan.FromMinutes(20) };

            var tasks = Enumerable.Range(1, 100).Select((i) =>
                httpClient.GetStringAsync("http://localhost/SlowWebsite/"));

            Console.WriteLine("Threads before completion: " + Process.GetCurrentProcess().Threads.Count);

            await Task.WhenAll(tasks);

            Console.WriteLine("Threads after completion: " + Process.GetCurrentProcess().Threads.Count);
        }

        static void Main(string[] args) {
            Console.WriteLine("Threads at start: " + Process.GetCurrentProcess().Threads.Count);

            var timer = new Stopwatch();
            timer.Start();

            var testTask = TestAsync();

            var distinctThreadIds = new HashSet<int>();
            while (!testTask.IsCompleted) {
                var threadIds = Process.GetCurrentProcess().Threads.OfType<ProcessThread>().Select(thread => thread.Id).ToList();
                distinctThreadIds.UnionWith(threadIds);
                Console.WriteLine("Current thread count: {0}; Cumulative thread count: {1}.", threadIds.Count, distinctThreadIds.Count);
                Thread.Sleep(250);
            }

            testTask.Wait();

            Console.WriteLine(timer.Elapsed);
            Console.ReadLine();
        }
    }
}

This code tries to run 100 HttpClient.GetStringAsync tasks making requests to a website that takes 1 minute to respond. At the same time it counts the number of active threads and the cumulative number of distinct threads created by the process. As I predicted, this program creates many new threads. The output looks like this:

Current thread count: 4; Cumulative thread count: 4.
....
Current thread count: 25; Cumulative thread count: 25.
....
Current thread count: 7; Cumulative thread count: 63.
Current thread count: 9; Cumulative thread count: 65.
00:10:01.9981006

This means that:

  • 61 new threads are created during the course of the async task execution.
  • The peak number of new active threads is 21.
  • The execution takes 10x more time (10 minutes instead of 1). This was caused by the local IIS limits.
Ark-kun asked Nov 30 '22 11:11



1 Answer

> Marking the methods async means that we can use the await keyword to basically schedule the heavy processing on another thread and free the UI thread until the processing is finished.

That's not at all how async works. See my async intro.

> You may say that the *Async methods use some special magic (like handling external signals) that makes them more efficient than their synchronous counterparts. The thing is that I don't see this being the case.

In pure asynchronous code, there is no thread (as I explain on my blog). In fact, at the device driver level, all (non-trivial) I/O is asynchronous. It is the synchronous APIs (at the OS level) that are an abstraction layer over the natural, asynchronous APIs.
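A rough way to see the difference is the sketch below. It assumes Task.Delay (which is timer-based) as a stand-in for a truly asynchronous I/O call, and Thread.Sleep inside Task.Run as a stand-in for a wrapped synchronous call. NoThreadSketch and MeasureAsync are hypothetical names for this illustration, not library APIs:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class NoThreadSketch
{
    // Starts `count` operations, waits for all of them, and returns the elapsed time.
    public static async Task<TimeSpan> MeasureAsync(int count, Func<Task> start)
    {
        var timer = Stopwatch.StartNew();
        // ToList() forces every operation to start before any of them is awaited.
        var tasks = Enumerable.Range(0, count).Select(_ => start()).ToList();
        await Task.WhenAll(tasks);
        return timer.Elapsed;
    }

    static void Main()
    {
        // Asynchronous wait: a timer completes each task; no thread sits blocked meanwhile.
        TimeSpan asyncTime = MeasureAsync(100, () => Task.Delay(1000))
            .GetAwaiter().GetResult();

        // Blocking wait wrapped in Task.Run: each operation occupies a pool thread for one second.
        TimeSpan wrappedTime = MeasureAsync(100, () => Task.Run(() => Thread.Sleep(1000)))
            .GetAwaiter().GetResult();

        Console.WriteLine("100 x Task.Delay:                " + asyncTime);
        Console.WriteLine("100 x Task.Run(Thread.Sleep):    " + wrappedTime);
    }
}
```

The exact timings depend on the machine and the thread pool settings. The wrapped version typically takes noticeably longer than one second, because every pending operation needs its own blocked thread and the pool adds threads only gradually, while the Task.Delay version needs no thread per operation.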

> Let's look at the Stream.ReadAsync for example.

Stream is an unusual case. As a base class, it has to prevent breaking changes as much as possible. So, when they added the virtual ReadAsync method, they had to add a default implementation. That default has to fall back on a non-ideal approach (Task.Run, in effect running the synchronous Read on a thread pool thread), which is unfortunate. In an ideal world, ReadAsync would be (or call) an abstract asynchronous implementation, but that would break every existing implementation of Stream.
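The fallback can be observed with a small experiment. SyncOnlyStream below is a hypothetical subclass that overrides only the synchronous Read, the way a Stream written before ReadAsync existed would. The sketch records which kind of thread ends up executing Read when the inherited ReadAsync is called; the expectation, assuming the default base-class behavior described above, is a thread pool thread:

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical stream that implements only the synchronous members.
class SyncOnlyStream : Stream
{
    // Set inside Read: was the synchronous Read executed by a thread pool thread?
    public bool ReadRanOnPoolThread { get; private set; }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }

    public override int Read(byte[] buffer, int offset, int count)
    {
        ReadRanOnPoolThread = Thread.CurrentThread.IsThreadPoolThread;
        Thread.Sleep(100); // simulate slow, blocking I/O
        return 0;          // report end of stream
    }
}

static class StreamSketch
{
    static void Main()
    {
        var stream = new SyncOnlyStream();
        var buffer = new byte[16];

        // ReadAsync is not overridden, so the base Stream implementation is used.
        int read = stream.ReadAsync(buffer, 0, buffer.Length).GetAwaiter().GetResult();

        Console.WriteLine("Bytes read: " + read);
        Console.WriteLine("Read ran on a thread pool thread: " + stream.ReadRanOnPoolThread);
    }
}
```

A stream type with a real asynchronous implementation overrides ReadAsync and avoids tying up a thread for the duration of the read.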

For a more proper example, compare the difference between WebClient and HttpClient.

Stephen Cleary answered Dec 05 '22 02:12
