 

Task.StartNew() vs Parallel.ForEach: Multiple Web Requests Scenario

I have read through all the related questions on SO, but I'm a little confused about the best approach for my scenario, where multiple web service calls are fired.

I have an aggregator service that takes an input, parses and translates it into multiple web requests, makes the web request calls (unrelated, so they could be fired in parallel), and consolidates the responses into one response that is sent back to the caller. The following code is used right now:

// one Task per request; each runs ProcessRequest on a thread pool thread
list.ForEach((object obj) =>
{
    tasks.Add(Task.Factory.StartNew((object state) =>
    {
        this.ProcessRequest(obj);
    }, obj, CancellationToken.None,
    TaskCreationOptions.AttachedToParent, TaskScheduler.Default));
});
await Task.WhenAll(tasks);

The await Task.WhenAll(tasks) comes from Scott Hanselman's post, where it is said that

"A better solution from a scalability perspective, says Stephen, is to take advantage of asynchronous I/O. When you're calling out across the network, there's no reason (other than convenience) to blocks threads while waiting for the response to come back"

The existing code appears to consume too many threads, and Processor Time shoots up to 100% under production load, which gets me thinking.

The other alternative is to use Parallel.ForEach, which uses a partitioner and also "blocks" the call, which is fine for my scenario.
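
For reference, that alternative would look roughly like this (a sketch, assuming the same list and ProcessRequest as in the code above):

// Sketch: Parallel.ForEach partitions the list across worker threads and
// blocks the calling thread until every ProcessRequest call has returned.
Parallel.ForEach(list, obj =>
{
    this.ProcessRequest(obj);
});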

Considering this is all "Async IO" work and not "CPU bound" work, and the web requests are not long running (they return in 3 seconds at most), I tend to believe the existing code is good enough. But would it provide better throughput than Parallel.ForEach? Parallel.ForEach probably uses a "minimal" number of Tasks because of the partitioning, and therefore makes optimal use of threads(?). I did test Parallel.ForEach in some local tests, and it doesn't appear to be any better.

The goal is to reduce CPU time and increase throughput, and therefore improve scalability. Is there a better approach for handling web requests in parallel?

Appreciate any inputs, thanks.

EDIT: The ProcessRequest method shown in the code sample above does indeed use HttpClient and its async methods to fire requests (PostAsync, GetAsync, PutAsync).
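
For context, a hypothetical sketch of what such a ProcessRequest might look like; MyRequest, httpClient, and Consolidate are placeholder names, and the blocking .Result call is an assumption based on the thread usage described above:

// Hypothetical sketch only - the actual ProcessRequest is not shown here.
// If the method blocks on the async HttpClient call (e.g. via .Result),
// each in-flight request still holds a thread pool thread for its full
// duration, even though the underlying IO is asynchronous.
private void ProcessRequest(object obj)
{
    var request = (MyRequest)obj; // assumed input type
    // blocking on .Result is what makes this sync-over-async
    var response = this.httpClient.PostAsync(request.Url, request.Content).Result;
    this.Consolidate(response); // assumed consolidation step
}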

asked Jun 05 '15 by Lalman


2 Answers

makes the web request calls (unrelated, so could be fired in parallel)

What you actually want is to call them concurrently, not in parallel. That is, "at the same time", not "using multiple threads".

The existing code appears to consume too many threads

Yeah, I think so too. :)

Considering this is all "Async IO" work and not "CPU bound" work

Then it should all be done asynchronously, and not using task parallelism or other parallel code.

As Antii pointed out, you should make your asynchronous code asynchronous:

public async Task ProcessRequestAsync(...);

Then what you want to do is consume it using asynchronous concurrency (Task.WhenAll), not parallel concurrency (StartNew/Run/Parallel):

await Task.WhenAll(list.Select(x => ProcessRequestAsync(x)));
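
A fuller sketch of how ProcessRequestAsync could look; MyRequest, httpClient, and Consolidate are placeholder names for pieces the question describes but does not show:

// Sketch: a fully asynchronous ProcessRequestAsync. While the HTTP call
// is in flight, no thread is blocked; the thread is freed at the await
// and resumes only when the response arrives.
public async Task ProcessRequestAsync(object obj)
{
    var request = (MyRequest)obj; // assumed input type
    var response = await this.httpClient.PostAsync(request.Url, request.Content);
    this.Consolidate(response); // assumed consolidation step
}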
answered Oct 05 '22 by Stephen Cleary


If you are CPU bound (and you are: "Processor Time shoots up to 100%"), you need to reduce CPU usage. Async IO does nothing to help with that. If anything, it causes slightly more CPU usage (unnoticeable here).

Profile the app to see what takes so much CPU time and optimize that code.

The way you initiate parallelism (Parallel, Task, async IO) does nothing to change the efficiency of the parallel action itself. The network does not get faster if you call it in an async way; it's still the same hardware. Nor does it use any less CPU.

Determine the optimal degree of parallelism experimentally and choose a parallelism technique that is suitable for that degree. If it's a few dozen, threads are totally fine. If it's in the hundreds, seriously consider async IO.
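
A sketch of one common way to cap concurrency at an experimentally determined level, using a SemaphoreSlim gate and assuming the ProcessRequestAsync from the first answer; the limit of 20 is an arbitrary placeholder:

// Sketch: throttle concurrent requests to a fixed, experimentally chosen cap.
var throttle = new SemaphoreSlim(20); // placeholder limit

var tasks = list.Select(async obj =>
{
    await throttle.WaitAsync(); // wait for a free slot
    try
    {
        await ProcessRequestAsync(obj);
    }
    finally
    {
        throttle.Release(); // free the slot for the next request
    }
});
await Task.WhenAll(tasks);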

answered Oct 05 '22 by usr