I'm working on performance optimization of a program that makes heavy use of async/await. Generally speaking, it downloads thousands of JSON documents over HTTP in parallel, parses them, and builds a response from that data. We see performance issues when handling many requests simultaneously (e.g. downloading 1,000 JSON documents): a simple HTTP request can take a few minutes.
I wrote a small console app to test it on a simplified example:
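using System;
using System.Diagnostics;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        // Start 100,000 tasks without waiting for them.
        for (int i = 0; i < 100000; i++)
        {
            Task.Run(IoBoundWork);
        }
        // Keep the process alive while the tasks run.
        Console.ReadKey();
    }

    private static async Task IoBoundWork()
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(1000);
        Console.WriteLine(sw.Elapsed);
    }
}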
And I can see similar behavior here:
The question is why "await Task.Delay(1000)" eventually takes 23 sec.
Task.Delay isn't broken, but you're performing 100,000 tasks which each take some time. It's the call to Console.WriteLine that is causing the problem in this particular case. Each call is cheap, but they're accessing a shared resource, so they aren't very highly parallelizable.
If you remove the call to Console.WriteLine, all the tasks complete very quickly. I changed your code to return the elapsed time that each task observes, and then print just a single line of output at the end - the maximum observed time. On my computer, without any Console.WriteLine call, I see output of about 1.16 seconds, showing very little inefficiency:
using System;
using System.Linq;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        // Ask the thread pool for a high minimum thread count.
        ThreadPool.SetMinThreads(50000, 50000);
        // Start all the tasks and keep them so the results can be read later.
        var tasks = Enumerable.Repeat(0, 100000)
            .Select(_ => Task.Run(IoBoundWork))
            .ToArray();
        Task.WaitAll(tasks);
        // Print a single line: the longest time any task observed.
        var maxTime = tasks.Max(t => t.Result);
        Console.WriteLine($"Max: {maxTime}");
    }

    private static async Task<double> IoBoundWork()
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(1000);
        // Return the elapsed time instead of writing it to the console.
        return sw.Elapsed.TotalSeconds;
    }
}
You can then modify IoBoundWork to do different tasks, and see the effect. Examples of work to try:
Console.WriteLine (synchronous, writing to a shared resource)
Asynchronous IO (await foo.WriteAsync(...) etc)
You can also try removing the call to Task.Delay(1000) or changing it. I found that by removing it entirely, the result was very small - whereas replacing it with Task.Yield was very similar to Task.Delay. It's worth remembering that as soon as your async method has to actually "pause" you're effectively doubling the task scheduling problem - instead of scheduling 100,000 operations, you're scheduling 200,000.
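To try the Task.Delay variations described above side by side, a small harness along these lines might help. It is only a sketch: the names PauseComparison, Measure, NoPause, YieldOnly and DelayOneSecond are made up for illustration and are not part of the original code, and the numbers depend heavily on the machine and runtime, so no expected output is shown.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

static class PauseComparison
{
    // Hypothetical helper: starts `count` copies of `work` on the thread pool
    // and returns the largest elapsed time (in seconds) that any copy observed.
    public static double Measure(Func<Task<double>> work, int count)
    {
        var tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Run(() => work()))
            .ToArray();
        Task.WaitAll(tasks);
        return tasks.Max(t => t.Result);
    }

    // No await at all: the method completes synchronously.
    public static Task<double> NoPause()
    {
        var sw = Stopwatch.StartNew();
        return Task.FromResult(sw.Elapsed.TotalSeconds);
    }

    // Yields once, so the rest of the method runs as a scheduled continuation.
    public static async Task<double> YieldOnly()
    {
        var sw = Stopwatch.StartNew();
        await Task.Yield();
        return sw.Elapsed.TotalSeconds;
    }

    // Same shape as IoBoundWork: pause for a second, then report the time.
    public static async Task<double> DelayOneSecond()
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(1000);
        return sw.Elapsed.TotalSeconds;
    }

    static void Main()
    {
        const int count = 100000;
        Console.WriteLine($"No pause:   {Measure(NoPause, count)}");
        Console.WriteLine($"Task.Yield: {Measure(YieldOnly, count)}");
        Console.WriteLine($"Task.Delay: {Measure(DelayOneSecond, count)}");
    }
}
```

Each variant is measured in its own batch, so the console is only touched three times in total and does not interfere with the timings.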
You'll see a different pattern in each case. Fundamentally, you're starting 100,000 tasks, asking them all to wait for a second, then asking them all to do something. That causes issues in terms of continuation scheduling that's async/await specific, but also plain resource management of "Performing 100,000 tasks each of which needs to write to the console is going to take a while."
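If the individual timings are still wanted, one possible way to keep the console out of the tasks themselves is to collect the results and write them in a single call once everything has finished. This is a sketch under that assumption, not a measured recommendation: BatchedOutput, TimedDelay and BuildReport are hypothetical names introduced here.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

static class BatchedOutput
{
    // Same work as before, but the elapsed time is returned, not printed.
    public static async Task<TimeSpan> TimedDelay()
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(1000);
        return sw.Elapsed;
    }

    // Builds the whole report in memory, one line per result.
    public static string BuildReport(TimeSpan[] results)
    {
        var sb = new StringBuilder();
        foreach (var result in results)
        {
            sb.AppendLine(result.ToString());
        }
        return sb.ToString();
    }

    static async Task Main()
    {
        var tasks = Enumerable.Range(0, 100000)
            .Select(_ => Task.Run(() => TimedDelay()));
        // WhenAll gathers every result without blocking a thread while waiting.
        TimeSpan[] results = await Task.WhenAll(tasks);
        // The shared resource is written to once, after all tasks are done.
        Console.Write(BuildReport(results));
    }
}
```

The tasks no longer compete for the console while they are being timed; the cost of printing 100,000 lines is still there, but it is paid once at the end rather than inside every continuation.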