Parallel HTTP requests using System.Net.Http.HttpClient

I'm trying to figure out the correct way to parallelize HTTP requests using Task and async/await. I'm using the HttpClient class which already has async methods for retrieving data. If I just call it in a foreach loop and await the response, only one request gets sent at a time (which makes sense because during the await, control is returning to our event loop, not to the next iteration of the foreach loop).

My wrapper around HttpClient looks like this:

public sealed class RestClient
{
    private readonly HttpClient client;

    public RestClient(string baseUrl)
    {
        var baseUri = new Uri(baseUrl);

        client = new HttpClient
        {
            BaseAddress = baseUri
        };
    }

    public async Task<Stream> GetResponseStreamAsync(string uri)
    {
        var resp = await GetResponseAsync(uri);
        return await resp.Content.ReadAsStreamAsync();
    }

    public async Task<HttpResponseMessage> GetResponseAsync(string uri)
    {
        var resp = await client.GetAsync(uri);
        if (!resp.IsSuccessStatusCode)
        {
            // ...
        }

        return resp;
    }

    public async Task<T> GetResponseObjectAsync<T>(string uri)
    {
        using (var responseStream = await GetResponseStreamAsync(uri))
        using (var sr = new StreamReader(responseStream))
        using (var jr = new JsonTextReader(sr))
        {
            var serializer = new JsonSerializer {NullValueHandling = NullValueHandling.Ignore};
            return serializer.Deserialize<T>(jr);
        }
    }

    public async Task<string> GetResponseString(string uri)
    {
        using (var resp = await GetResponseStreamAsync(uri))
        using (var sr = new StreamReader(resp))
        {
            // Read asynchronously so the async method does not block a thread on I/O.
            return await sr.ReadToEndAsync();
        }
    }
}

And the code invoked by our event loop is:

public async void DoWork(Action<bool> onComplete)
{
    try
    {
        var restClient = new RestClient("https://example.com");

        // Parenthesize the await: .Ids belongs to the deserialized result, not to the Task.
        var ids = (await restClient.GetResponseObjectAsync<IdListResponse>("/ids")).Ids;

        Log.Info("Downloading {0:D} items", ids.Count);
        using (var fs = new FileStream(@"C:\test.json", FileMode.Create, FileAccess.Write, FileShare.Read))
        using (var sw = new StreamWriter(fs))
        {
            sw.Write("[");

            var first = true;
            var numCompleted = 0;
            foreach (var id in ids)
            {
                Log.Info("Downloading item {0:D}, completed {1:D}", id, numCompleted);
                numCompleted += 1;
                try
                {
                    var str = await restClient.GetResponseString($"/info/{id}");
                    if (!first)
                    {
                        sw.Write(",");
                    }

                    sw.Write(str);

                    first = false;
                }
                catch (HttpException e)
                {
                    if (e.StatusCode == HttpStatusCode.Forbidden)
                    {
                        Log.Warn(e.ResponseMessage);
                    }
                    else
                    {
                        throw;
                    }
                }
            }

            sw.Write("]");
        }

        onComplete(true);
    }
    catch (Exception e)
    {
        Log.Error(e);
        onComplete(false);
    }
}

I've tried a handful of different approaches involving Parallel.ForEach, Linq.AsParallel, and wrapping the entire contents of the loop in a Task.

Austin Wagner asked Feb 09 '17 15:02



1 Answer

The basic idea is to keep track of all the asynchronous tasks and await them all at once. The simplest way to do this is to extract the body of your foreach into a separate asynchronous method, and do something like this:

var tasks = ids.Select(i => DoWorkAsync(i));
await Task.WhenAll(tasks);

This way, the individual tasks are issued separately (still in sequence, but without waiting for the I/O to complete), and you await them all at the same time.
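To make that more concrete, here is a minimal sketch of what the extracted method and the surrounding call could look like. The names ParallelDownload, FetchAllAsync and ToJsonArray are hypothetical and not part of the original code; the fetch delegate stands in for a call such as restClient.GetResponseString from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class ParallelDownload
{
    // Starts one fetch per id without waiting for the previous one to finish,
    // then awaits them all together. Task.WhenAll returns the results in the
    // same order as the input tasks, regardless of which request finishes first.
    public static async Task<string[]> FetchAllAsync<TId>(
        IEnumerable<TId> ids,
        Func<TId, Task<string>> fetchAsync)
    {
        // ToList() materializes the sequence so each request is started exactly once.
        List<Task<string>> tasks = ids.Select(id => fetchAsync(id)).ToList();
        return await Task.WhenAll(tasks);
    }

    // Joins the per-item JSON strings into one JSON array, skipping null entries
    // (a fetch delegate may return null for an item it chose to skip).
    public static string ToJsonArray(IEnumerable<string> items)
    {
        return "[" + string.Join(",", items.Where(s => s != null)) + "]";
    }
}
```

With the question's code this might be called as `var items = await ParallelDownload.FetchAllAsync(ids, id => restClient.GetResponseString($"/info/{id}"));` followed by a single `sw.Write(ParallelDownload.ToJsonArray(items));`. Writing to the StreamWriter only after all tasks have completed keeps the file access sequential, and the per-item exception handling from the original loop (logging a 403 and moving on) can live inside the delegate, returning null for skipped items.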

Do note that you will also need to do some configuration: on .NET Framework, HTTP is throttled by default to only allow two simultaneous connections to the same server (the limit is ServicePointManager.DefaultConnectionLimit).
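As a rough sketch of that configuration on .NET Framework, the limit can be raised through ServicePointManager before the first request is sent. The HttpSetup class, the CreateClient method and the value passed in are assumptions made for illustration; pick a limit the target server can actually tolerate.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical helper, for illustration only.
public static class HttpSetup
{
    public static HttpClient CreateClient(string baseUrl, int maxConnectionsPerHost)
    {
        // Set this before the first request to a host: the default is applied
        // to connection groups (ServicePoint objects) created afterwards.
        ServicePointManager.DefaultConnectionLimit = maxConnectionsPerHost;

        return new HttpClient { BaseAddress = new Uri(baseUrl) };
    }
}
```

For example, `var client = HttpSetup.CreateClient("https://example.com", 10);` would allow up to ten simultaneous connections to that host. This setting targets .NET Framework; on newer runtimes the connection limit is configured on the HTTP handler instead, so treat this as a starting point rather than a universal fix.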

Luaan answered Oct 11 '22 12:10