
Why is synchronous code inside an asynchronously awaited Task much slower than asynchronous code?

Out of boredom, I have been playing with retrieving random Wikipedia articles all at the same time. First I wrote this code:

private async void Window_Loaded(object sender, RoutedEventArgs e)
{
    await DownloadAsync();
}

private async Task DownloadAsync()
{
    Stopwatch sw = new Stopwatch();
    sw.Start();
    var tasks = new List<Task>();
    var result = new List<string>();

    for (int index = 0; index < 60; index++)
    {
        var task = Task.Run(async () => {
            var scheduledAt = DateTime.UtcNow.ToString("mm:ss.fff");
            using (var client = new HttpClient())
            using (var response = await client.GetAsync("https://en.wikipedia.org/wiki/Special:Random"))
            using (var content = response.Content)
            {
                var page = await content.ReadAsStringAsync();
                var receivedAt = DateTime.UtcNow.ToString("mm:ss.fff");
                var data = $"Job done at thread: {Thread.CurrentThread.ManagedThreadId}, Scheduled at: {scheduledAt}, Recieved at: {receivedAt} {page}";
                // List<T> is not thread-safe, so guard the concurrent Add calls.
                lock (result)
                {
                    result.Add(data);
                }
            }
        });

        tasks.Add(task);
    }

    await Task.WhenAll(tasks.ToArray());

    sw.Stop();
    Console.WriteLine($"Process took: {sw.Elapsed.Seconds} sec {sw.Elapsed.Milliseconds} ms");

    foreach (var item in result)
    {
        Debug.WriteLine(item);
    }
}

But I wanted to get rid of the async anonymous method, Task.Run(async () => ..., so I replaced the relevant part of the code with this:

for (int index = 0; index < 60; index++)
{
    var task = Task.Run(() => {
        var scheduledAt = DateTime.UtcNow.ToString("mm:ss.fff");
        using (var client = new HttpClient())
        // Get this synchronously.
        using (var response = client.GetAsync("https://en.wikipedia.org/wiki/Special:Random").Result)
        using (var content = response.Content)
        {
            // Get this synchronously.
            var page = content.ReadAsStringAsync().Result;
            var receivedAt = DateTime.UtcNow.ToString("mm:ss.fff");
            var data = $"Job done at thread: {Thread.CurrentThread.ManagedThreadId}, Scheduled at: {scheduledAt}, Recieved at: {receivedAt} {page}";
            // List<T> is not thread-safe, so guard the concurrent Add calls.
            lock (result) { result.Add(data); }
        }
    });

    tasks.Add(task);
}

I expected it to perform exactly the same, because the asynchronous code I replaced with synchronous code is still wrapped inside a task, so I'm guaranteed that the task scheduler (WPF task scheduler) will queue it on some free thread from the ThreadPool. And this is exactly what happens: when I look at the returned results, I get values such as:

Job done at thread: 6, Scheduled at: 53:57.534, Recieved at: 54:54.545 ...
Job done at thread: 21, Scheduled at: 54:06.742, Recieved at: 54:54.574 ...
Job done at thread: 41, Scheduled at: 54:26.742, Recieved at: 54:54.576 ...
Job done at thread: 10, Scheduled at: 53:59.018, Recieved at: 54:54.614 ...

The problem is that the first version executes in ~6 seconds and the second one (with synchronous .Result) takes ~50 seconds. The difference gets smaller as I decrease the number of tasks. Can anyone explain why the tasks take so long, even though they execute on separate threads and perform exactly the same single operation?

asked Mar 16 '18 10:03 by FCin


1 Answer

Because the thread pool may introduce a delay when you request a new thread if the total number of threads in the pool has already reached a configurable minimum. By default, that minimum is the number of cores. In the example with .Result, you queue 60 tasks that each hold a thread pool thread for the whole duration of their execution. That means only as many tasks as there are cores start immediately; the rest start with a delay (the thread pool waits for a certain time to see whether an already busy thread becomes available and, if not, adds a new thread).
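To see the minimum described above on your own machine, you can query the pool directly. This is only a small sketch: the PoolInfo class and GetMinimums helper are names made up for illustration, while ThreadPool.GetMinThreads and Environment.ProcessorCount are the standard .NET members.

```csharp
using System;
using System.Threading;

public static class PoolInfo
{
    // Hypothetical helper: returns the pool's current minimum worker
    // and I/O completion thread counts.
    public static (int Worker, int Io) GetMinimums()
    {
        ThreadPool.GetMinThreads(out int worker, out int io);
        return (worker, io);
    }

    public static void Main()
    {
        var (worker, io) = GetMinimums();
        Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
        Console.WriteLine($"Minimum worker threads: {worker}, minimum I/O threads: {io}");
    }
}
```

Unless the minimum has been changed, the worker value should match the processor count, which is the number of tasks that can start without any delay.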

Even worse, the continuations of client.GetAsync (the code that executes inside GetAsync after it receives the reply from the server) are also scheduled on thread pool threads. That holds up all 60 of your tasks, because they cannot complete before receiving the result from GetAsync, and GetAsync needs a free thread pool thread to run its continuation. As a result, there is additional contention: the 60 tasks you created and the 60 continuations from GetAsync all want thread pool threads (while your 60 tasks are blocked waiting for the results of those continuations).

In the example with await, the thread pool thread is released for the duration of the asynchronous HTTP call. When you call await GetAsync() and GetAsync reaches the point of asynchronous I/O (actually makes the HTTP request), your thread is released back to the pool and is free to handle other work. That means the await example holds thread pool threads for much less time, and there is (almost) no delay while waiting for a thread pool thread to become available.
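A minimal sketch of the await behaviour described above, with no network involved. It assumes Task.Delay is a fair stand-in for the asynchronous HTTP call; AwaitDemo, FakeRequestAsync and RunAsync are hypothetical names, not part of the question's code.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AwaitDemo
{
    // Hypothetical stand-in for an HTTP request: Task.Delay simulates
    // asynchronous I/O and holds no thread while it is pending.
    public static async Task<int> FakeRequestAsync(int id)
    {
        await Task.Delay(1000);
        // Report which thread ran the continuation.
        return Thread.CurrentThread.ManagedThreadId;
    }

    // Starts `count` simulated requests at once and returns the elapsed time
    // together with the number of distinct threads that ran continuations.
    public static async Task<(TimeSpan Elapsed, int DistinctThreads)> RunAsync(int count)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, count).Select(i => FakeRequestAsync(i)).ToArray();
        int[] threadIds = await Task.WhenAll(tasks);
        sw.Stop();
        return (sw.Elapsed, threadIds.Distinct().Count());
    }

    public static async Task Main()
    {
        var (elapsed, threads) = await RunAsync(60);
        Console.WriteLine($"60 awaited requests took {elapsed.TotalMilliseconds:F0} ms on {threads} distinct thread(s)");
    }
}
```

Because no thread is blocked during the delay, the 60 simulated requests should finish in roughly the time of one, and the continuations should need far fewer than 60 threads.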

You can easily confirm this by calling (DON'T USE IN REAL CODE, for testing only)

ThreadPool.SetMinThreads(100, 100);

to increase the configurable minimum number of threads in the pool mentioned above. When you increase it to a large value, all 60 tasks in the example with .Result start at the same time on 60 thread pool threads, without delays, so both of your examples complete in roughly the same time.
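One way to try that experiment in isolation is sketched below. It assumes Thread.Sleep is an acceptable stand-in for blocking on .Result, and the MinThreadsDemo class, the RunBlocking helper and the "raise" command-line argument are all made up for illustration. Run it once without arguments and once with "raise", in separate processes, because threads created during a first run stay in the pool for a while and would skew a second measurement.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class MinThreadsDemo
{
    // Queues `count` tasks that each block a pool thread for one second
    // (a stand-in for blocking on .Result) and returns the total elapsed time.
    public static TimeSpan RunBlocking(int count)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Run(() => Thread.Sleep(1000)))
            .ToArray();
        Task.WaitAll(tasks);
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main(string[] args)
    {
        // Hypothetical switch: pass "raise" to lift the minimum before measuring.
        if (args.Contains("raise"))
        {
            ThreadPool.SetMinThreads(100, 100); // for testing only
        }

        Console.WriteLine($"60 blocking tasks took {RunBlocking(60).TotalSeconds:F1} s");
    }
}
```

With the default minimum, the run should take noticeably longer than one second, since most tasks wait for the pool to add threads; with the raised minimum it should stay close to one second.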

Here is a sample application to observe how it works:

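using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public class Program {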
    public static void Main(string[] args) {
        DownloadAsync().Wait();
        Console.ReadKey();
    }

    private static async Task DownloadAsync() {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        var tasks = new List<Task>();
        for (int index = 0; index < 60; index++) {
            var tmp = index;
            var task = Task.Run(() => {
                ThreadPool.GetAvailableThreads(out int wt, out _);
                ThreadPool.GetMaxThreads(out int mt, out _);
                Console.WriteLine($"Started: {tmp} on thread {Thread.CurrentThread.ManagedThreadId}. Threads in pool: {mt - wt}");
                var res = DoStuff(tmp).Result;
                Console.WriteLine($"Done {res} on thread {Thread.CurrentThread.ManagedThreadId}");
            });

            tasks.Add(task);
        }

        await Task.WhenAll(tasks.ToArray());

        sw.Stop();
        Console.WriteLine($"Process took: {sw.Elapsed.Seconds} sec {sw.Elapsed.Milliseconds} ms");
    }

    public static async Task<string> DoStuff(int i) {
        await Task.Delay(1000); // web request
        Console.WriteLine($"continuation of {i} on thread {Thread.CurrentThread.ManagedThreadId}"); // continuation
        return i.ToString();
    }
}
answered Oct 29 '22 13:10 by Evk