I have some code that performs batch HTML page processing using content from one specific host. It tries to make a large number (~400) of simultaneous HTTP requests using HttpClient
. I believe that the maximum number of simultaneous connections is restricted by ServicePointManager.DefaultConnectionLimit
, so I'm not applying my own concurrency restrictions.
After sending all of the requests asynchronously to HttpClient
using Task.WhenAll
, the entire batch operation can be cancelled using CancellationTokenSource
and CancellationToken
. The progress of the operation is viewable via a user interface, and a button can be clicked to perform the cancellation.
The call to CancellationTokenSource.Cancel()
blocks for roughly 5 - 30 seconds. This causes the user interface to freeze. Is suspect that this occurs because the method is calling the code that registered for cancellation notification.
HttpClient
already seems to queue excess requests itself.CancellationTokenSource.Cancel()
method call in a non-UI thread. This didn't work too well; the task didn't actually run until most of the others had finished. I think an async
version of the method would work well, but I couldn't find one. Also, I have the impression that it's suitable to use the method in a UI thread.class Program
{
private const int desiredNumberOfConnections = 418;
static void Main(string[] args)
{
ManyHttpRequestsTest().Wait();
Console.WriteLine("Finished.");
Console.ReadKey();
}
private static async Task ManyHttpRequestsTest()
{
using (var client = new HttpClient())
using (var cancellationTokenSource = new CancellationTokenSource())
{
var requestsCompleted = 0;
using (var allRequestsStarted = new CountdownEvent(desiredNumberOfConnections))
{
Action reportRequestStarted = () => allRequestsStarted.Signal();
Action reportRequestCompleted = () => Interlocked.Increment(ref requestsCompleted);
Func<int, Task> getHttpResponse = index => GetHttpResponse(client, cancellationTokenSource.Token, reportRequestStarted, reportRequestCompleted);
var httpRequestTasks = Enumerable.Range(0, desiredNumberOfConnections).Select(getHttpResponse);
Console.WriteLine("HTTP requests batch being initiated");
var httpRequestsTask = Task.WhenAll(httpRequestTasks);
Console.WriteLine("Starting {0} requests (simultaneous connection limit of {1})", desiredNumberOfConnections, ServicePointManager.DefaultConnectionLimit);
allRequestsStarted.Wait();
Cancel(cancellationTokenSource);
await WaitForRequestsToFinish(httpRequestsTask);
}
Console.WriteLine("{0} HTTP requests were completed", requestsCompleted);
}
}
private static void Cancel(CancellationTokenSource cancellationTokenSource)
{
Console.Write("Cancelling...");
var stopwatch = Stopwatch.StartNew();
cancellationTokenSource.Cancel();
stopwatch.Stop();
Console.WriteLine("took {0} seconds", stopwatch.Elapsed.TotalSeconds);
}
private static async Task WaitForRequestsToFinish(Task httpRequestsTask)
{
Console.WriteLine("Waiting for HTTP requests to finish");
try
{
await httpRequestsTask;
}
catch (OperationCanceledException)
{
Console.WriteLine("HTTP requests were cancelled");
}
}
private static async Task GetHttpResponse(HttpClient client, CancellationToken cancellationToken, Action reportStarted, Action reportFinished)
{
var getResponse = client.GetAsync("http://www.google.com", cancellationToken);
reportStarted();
using (var response = await getResponse)
response.EnsureSuccessStatusCode();
reportFinished();
}
}
Why does cancellation block for so long? Also, is there anything that I'm doing wrong or could be doing better?
Performing the CancellationTokenSource.Cancel() method call in a non-UI thread. This didn't work too well; the task didn't actually run until most of the others had finished.
What this tells me is that you're probably suffering from 'threadpool exhaustion', which is where your threadpool queue has so many items in it (from HTTP requests completing) that it takes a while to get through them all. Cancellation probably is blocking on some threadpool work item executing and it can't skip to the head of the queue.
This suggests that you do need to go with option 1 from your consideration list. Throttle your own work so that the threadpool queue remains relatively short. This is good for app responsiveness overall anyway.
My favorite way to throttle async work is to use Dataflow. Something like this:
var block = new ActionBlock<Uri>(
async uri => {
var httpClient = new HttpClient(); // HttpClient isn't thread-safe, so protect against concurrency by using a dedicated instance for each request.
var result = await httpClient.GetAsync(uri);
// do more stuff with result.
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 20, CancellationToken = cancellationToken });
for (int i = 0; i < 1000; i++)
block.Post(new Uri("http://www.server.com/req" + i));
block.Complete();
await block.Completion; // waits until everything is done or canceled.
As an alternative, you could use Task.Factory.StartNew passing in TaskCreationOptions.LongRunning so your task gets a new thread (not affiliated with threadpool) which would allow it to start immediately and call Cancel from there. But you should probably solve the threadpool exhaustion problem instead.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With