
Parallel.ForEach and HttpClient - strange behaviour

I have a piece of code that loops over a collection and calls HttpClient on each iteration. The API that HttpClient calls takes 30-40 ms on average to execute. When I call it sequentially I get the expected outcome, but as soon as I use Parallel.ForEach it takes longer. Looking closely at the logs, I can see that quite a few HttpClient calls take more than 1000 ms to execute, after which the time drops back to 30-40 ms. According to the API logs, the API itself barely goes over 100 ms. I am not sure why I get this spike.

The code is

using (var client = new HttpClient())
{
  var content = new StringContent(parameters, Encoding.UTF8, "application/json");
  var response = client.PostAsync(url, content);
  _log.Info(string.Format("Took {0} ms to send post", watch.ElapsedMilliseconds));
  watch.Restart();

  var responseString = response.Result.Content.ReadAsStringAsync();
  _log.Info(string.Format("Took {0} ms to readstring after post", watch.ElapsedMilliseconds));
}

The parallel call is something like this

    Console.WriteLine("starting parallel...");
    Parallel.ForEach(recipientCollections, recipientCollection =>
    {
        // A lot of processing happens here to create relevant content
        var secondaryCountryRecipientList = string.Join(",", refinedCountryRecipients);
        var emailApiParams = new SendEmailParametersModel(CountrySubscriberApplicationId,
            queueItem.SitecoreId, queueItem.Version, queueItem.Language, countryFeedItem.Subject,
            countryFeedItem.Html, countryFeedItem.From, _recipientsFormatter.Format(secondaryCountryRecipientList));

        log.Info(string.Format("Sending email request for {0}. Recipients {1}", queueItem.SitecoreId, secondaryCountryRecipientList));

        var response = _notificationsApi.Invoke(emailApiParams);
    });

thanks

asked Jul 06 '16 10:07 by Actuary


1 Answer

By default, .NET Framework allows only 2 concurrent connections per server. To raise the limit, set ServicePointManager.DefaultConnectionLimit to a larger value, e.g. 20 or 100.
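The question does not say which runtime is in use, so the following is only a sketch under that assumption. On .NET Core and later, HttpClient does not go through ServicePointManager, and the limit can instead be set per handler with HttpClientHandler.MaxConnectionsPerServer (also available on recent .NET Framework versions). The class and method names below are hypothetical:

```csharp
using System.Net.Http;

// Hypothetical helper, not part of the original code.
public static class LimitedClientFactory
{
    // Builds a handler that allows at most maxConnectionsPerServer
    // simultaneous connections to any single server.
    public static HttpClientHandler CreateHandler(int maxConnectionsPerServer)
    {
        return new HttpClientHandler { MaxConnectionsPerServer = maxConnectionsPerServer };
    }

    // Builds one HttpClient on top of that handler; create it once and
    // reuse it for every request instead of wrapping each call in `using`.
    public static HttpClient CreateClient(int maxConnectionsPerServer)
    {
        return new HttpClient(CreateHandler(maxConnectionsPerServer));
    }
}
```

For example, `LimitedClientFactory.CreateClient(20)` gives a client whose limit applies only to that client, whereas ServicePointManager.DefaultConnectionLimit is process-wide.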

Raising the limit won't prevent you from flooding the server or consuming too much memory if you make too many requests, though. A better option is to use an ActionBlock<T> (from TPL Dataflow) to buffer the requests and send them in parallel in a controlled fashion, e.g.:

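 // Raise the per-server connection limit (the default is 2)
 ServicePointManager.DefaultConnectionLimit = 20;

 // One HttpClient is shared by all requests
 var client = new HttpClient();

 // At most 10 messages are processed at the same time
 var blockOptions = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10 };

 var emailBlock = new ActionBlock<SendEmailParametersModel>(async parameters =>
     {
         var watch = Stopwatch.StartNew();
         // StringContent needs a string, not the model itself.
         // Assumption: Json.NET is referenced; use whatever serializer the API client already uses.
         var json = JsonConvert.SerializeObject(parameters);
         var content = new StringContent(json, Encoding.UTF8, "application/json");
         var response = await client.PostAsync(url, content);
         _log.Info(..);
         watch.Restart();

         // response is already awaited, so read Content directly (no .Result)
         var responseString = await response.Content.ReadAsStringAsync();
         _log.Info(...);
     }, blockOptions); // the options must be passed in, otherwise the block processes one message at a time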

Sending the emails doesn't require parallel invocation any more:

foreach (var recipientCollection in recipientCollections)
{
    var secondaryCountryRecipientList = string.Join(",", refinedCountryRecipients);
    var emailApiParams = new SendEmailParametersModel(CountrySubscriberApplicationId,
        queueItem.SitecoreId, queueItem.Version, queueItem.Language, countryFeedItem.Subject,
        countryFeedItem.Html, countryFeedItem.From, _recipientsFormatter.Format(secondaryCountryRecipientList));

    // Post only queues the message; the block sends it in the background
    emailBlock.Post(emailApiParams);
    log.Info(...);
}
// Signal that no more messages will be posted, then wait for the queued ones
emailBlock.Complete();
await emailBlock.Completion; // Completion is a Task property, not a method

HttpClient is thread-safe, which allows you to use the same client for all requests.

The code above will buffer all requests and execute them 10 at a time. Calling Complete() tells the block to stop accepting new messages and to finish processing the ones already buffered. await emailBlock.Completion waits for all buffered messages to finish before proceeding.
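To see the throttling in isolation, here is a minimal self-contained sketch. It is not the original email code: the class name ThrottleDemo is hypothetical, and Task.Delay stands in for the HTTP call so the example runs without a server. It assumes the System.Threading.Tasks.Dataflow library is available (a NuGet package on .NET Framework).

```csharp
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical demo class, not part of the original code.
public static class ThrottleDemo
{
    // Posts itemCount items to an ActionBlock limited to maxParallel concurrent
    // handlers. Returns how many items were processed and the highest number
    // of handlers observed running at the same time.
    public static async Task<(int Processed, int PeakConcurrency)> RunAsync(int itemCount, int maxParallel)
    {
        var gate = new object();
        int running = 0, peak = 0, processed = 0;

        var options = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel };

        var block = new ActionBlock<int>(async item =>
        {
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }

            // Stand-in for: await client.PostAsync(url, content)
            await Task.Delay(20);

            lock (gate)
            {
                running--;
                processed++;
            }
        }, options);

        // The block's input buffer is unbounded by default, so Post only queues the item.
        for (int i = 0; i < itemCount; i++)
        {
            block.Post(i);
        }

        block.Complete();        // no more input will arrive
        await block.Completion;  // wait until every queued item has been handled

        return (processed, peak);
    }
}
```

Awaiting `ThrottleDemo.RunAsync(50, 10)` should report 50 processed items with a peak concurrency of at most 10, which is the behaviour the email block relies on.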

answered Sep 28 '22 07:09 by Panagiotis Kanavos