Possible Duplicate:
Parallel.ForEach vs Task.Factory.StartNew
I need to run about 1,000 tasks in a ThreadPool on a nightly basis (the number may grow in the future). Each task performs a long-running operation (reading data from a web service) and is not CPU intensive. Async I/O is not an option for this particular use case.
Given an IList<string> of parameters, I need to call DoSomething(string x) for each one. I am trying to pick between the following two options:
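IList<Task> tasks = new List<Task>();
foreach (var p in parameters)
{
    // Copy the loop variable: before C# 5 the lambda would capture a single shared p.
    var local = p;
    tasks.Add(Task.Factory.StartNew(() => DoSomething(local), TaskCreationOptions.LongRunning));
}
// Block until every task has finished.
Task.WaitAll(tasks.ToArray());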
OR
Parallel.ForEach(parameters, new ParallelOptions {MaxDegreeOfParallelism = Environment.ProcessorCount*32}, DoSomething);
Which option is better and why?
Note: The answer should include a comparison between the usage of TaskCreationOptions.LongRunning and MaxDegreeOfParallelism = Environment.ProcessorCount * SomeConstant.
Perhaps you aren't aware of this, but the members in the Parallel class are simply (complicated) wrappers around Task objects. In case you're wondering, the Parallel class creates the Task objects with TaskCreationOptions.None. However, the MaxDegreeOfParallelism would affect those task objects no matter what creation options were passed to the task object's constructor.
TaskCreationOptions.LongRunning gives a "hint" to the underlying TaskScheduler that it might perform better with oversubscription of the threads. Oversubscription is good for high-latency work such as I/O, because it assigns more than one thread (yes, thread, not task) to a single core, so the core always has something to do instead of idling while a thread sits in a waiting state until an operation completes. On the TaskScheduler that uses the ThreadPool, LongRunning tasks run on their own dedicated thread (the only case where you have a thread per task); otherwise tasks run normally, with scheduling and work stealing (really, what you want here anyway).
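The dedicated-thread behavior described above can be observed with a minimal sketch like the following. The class and method names (LongRunningDemo, RanOnPoolThread) are hypothetical, and the result reflects how the default scheduler currently behaves rather than a documented guarantee:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Starts one task on the default scheduler and reports whether
    // its body executed on a thread-pool thread.
    public static bool RanOnPoolThread(TaskCreationOptions options)
    {
        Task<bool> task = Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            TaskScheduler.Default);
        return task.Result;
    }

    public static void Main()
    {
        Console.WriteLine("None:        pool thread = "
            + RanOnPoolThread(TaskCreationOptions.None));
        Console.WriteLine("LongRunning: pool thread = "
            + RanOnPoolThread(TaskCreationOptions.LongRunning));
    }
}
```

If the LongRunning call reports that it did not run on a pool thread, that is the "thread per task" case: 1,000 such tasks would mean roughly 1,000 threads.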
MaxDegreeOfParallelism controls the number of concurrent operations run. It's similar to specifying the maximum number of partitions that the data will be split into and processed from. If TaskCreationOptions.LongRunning could be specified, all this would do is limit the number of tasks running at a single time, similar to a TaskScheduler whose maximum concurrency level is set to that value, similar to this example.
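To see the cap in action, here is a small sketch that counts how many loop bodies overlap. MaxDopDemo and ObservedPeak are made-up names, and Thread.Sleep is only a stand-in for the blocking web service call:

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class MaxDopDemo
{
    // Runs itemCount simulated blocking calls and returns the highest
    // number of loop bodies observed executing at the same time.
    public static int ObservedPeak(int itemCount, int maxDop)
    {
        int current = 0, peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDop };

        Parallel.ForEach(Enumerable.Range(0, itemCount), options, _ =>
        {
            int now = Interlocked.Increment(ref current);

            // Raise the recorded peak with a compare-and-swap loop.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)))
                Interlocked.CompareExchange(ref peak, now, seen);

            Thread.Sleep(20); // stand-in for a blocking web service call
            Interlocked.Decrement(ref current);
        });

        return peak;
    }

    public static void Main()
    {
        Console.WriteLine("Peak concurrency: " + ObservedPeak(100, 4));
    }
}
```

The value is an upper bound only: the observed peak never exceeds maxDop, but the scheduler is free to run fewer bodies at once.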
You might want Parallel.ForEach. However, setting MaxDegreeOfParallelism to such a high number won't actually guarantee that there will be that many threads running at once, since the tasks will still be controlled by the ThreadPoolTaskScheduler. That scheduler will keep the number of threads running at once to the smallest amount possible, which I suppose is the biggest difference between the two methods. You could write (and specify) your own TaskScheduler that would mimic the max degree of parallelism behavior, and have the best of both worlds, but I doubt that's something you're interested in doing.
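Short of writing a custom TaskScheduler, a simpler technique gets a similar effect: start a fixed number of LongRunning worker tasks that drain a shared queue. This is a swapped-in approach, not the scheduler described above, and BoundedWorkers, Run and workerCount are hypothetical names:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BoundedWorkers
{
    // Processes every parameter using exactly workerCount dedicated workers.
    public static void Run(
        IEnumerable<string> parameters,
        int workerCount,
        Action<string> doSomething)
    {
        var queue = new ConcurrentQueue<string>(parameters);
        var workers = new Task[workerCount];

        for (int i = 0; i < workerCount; i++)
        {
            // Each worker pulls items until the queue is empty.
            workers[i] = Task.Factory.StartNew(() =>
            {
                string p;
                while (queue.TryDequeue(out p))
                    doSomething(p);
            }, TaskCreationOptions.LongRunning);
        }

        // Rethrows worker failures wrapped in an AggregateException.
        Task.WaitAll(workers);
    }

    public static void Main()
    {
        var items = new List<string> { "a", "b", "c", "d" };
        Run(items, 2, p => Console.WriteLine("processed " + p));
    }
}
```

Note that an exception thrown by doSomething ends that worker early, so in real use you would probably catch and record failures per item.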
My guess is that, depending on latency and the number of actual requests you need to make, using tasks will perform better in many(?) cases, though it will wind up using more memory, while Parallel will be more consistent in resource usage. Of course, async I/O will perform monstrously better than either of these two options, but I understand you can't do that because you're using legacy libraries. So, unfortunately, you'll be stuck with mediocre performance no matter which one of those you choose.
A real solution would be to figure out a way to make async I/O happen; since I don't know the situation, I don't think I can be more helpful than that. With async I/O, your program (read: thread) will continue execution, and the kernel will wait for the I/O operation to complete (this is also known as using I/O completion ports). Because the thread is not in a waiting state, the runtime can do more work on fewer threads, which usually ends up in an optimal relationship between the number of cores and the number of threads. Adding more threads, as much as I wish it would, does not equate to better performance (actually, it can often hurt performance, because of things like context switching).
However, this entire answer is useless in determining a final answer for your question, though I hope it will give you some needed direction. You won't know what performs better until you profile it. If you don't try them both (I should clarify that I mean the Task without the LongRunning option, letting the scheduler handle thread switching) and profile them to determine what is best for your particular use case, you're selling yourself short.
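As a starting point for that comparison, a rough timing harness might look like the sketch below. Profile and Time are hypothetical names, and the Thread.Sleep body is an assumption standing in for the real DoSomething, so the numbers only mean something once the actual web service call is plugged in:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Profile
{
    // Measures the wall-clock time of one strategy.
    public static TimeSpan Time(Action strategy)
    {
        var sw = Stopwatch.StartNew();
        strategy();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        var parameters = Enumerable.Range(0, 200).Select(i => "p" + i).ToList();
        Action<string> doSomething = _ => Thread.Sleep(50); // stand-in workload

        // Plain tasks without LongRunning, scheduled by the thread pool.
        TimeSpan tasks = Time(() => Task.WaitAll(
            parameters.Select(p => Task.Factory.StartNew(() => doSomething(p)))
                      .ToArray()));

        // Parallel.ForEach with default options.
        TimeSpan parallel = Time(() => Parallel.ForEach(parameters, doSomething));

        Console.WriteLine("Tasks:            " + tasks);
        Console.WriteLine("Parallel.ForEach: " + parallel);
    }
}
```

Run each strategy several times and against the real service before drawing conclusions; a single run mostly measures thread-pool warm-up.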
Both options are entirely inappropriate for your scenario.
TaskCreationOptions.LongRunning is certainly a better choice for tasks that are not CPU-bound, as the TPL (Parallel classes/extensions) is almost exclusively meant for maximizing the throughput of a CPU-bound operation by running it on multiple cores (not threads).
However, 1000 tasks is an unacceptable number for this. Whether or not they're all running at once isn't exactly the issue; even 100 threads waiting on synchronous I/O is an untenable situation. As one of the comments suggests, your application will be using an enormous amount of memory and end up spending almost all of its time in context-switching. The TPL is not designed for this scale.
If your operations are I/O bound - and if you are using web services, they are - then async I/O is not only the correct solution, it's the only solution. If you have to re-architect some of your code (such as, for example, adding asynchronous methods to major interfaces where there were none originally), do it, because I/O completion ports are the only mechanism in Windows or .NET that can properly support this particular type of concurrency.
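For illustration, an async fan-out with a cap on in-flight requests could be sketched as follows. AsyncFanOut, FetchAllAsync, fetchAsync and maxInFlight are hypothetical names, and the sketch assumes an asynchronous version of the web service call is available, which the question says is not currently the case:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncFanOut
{
    // Starts one async operation per parameter, with at most maxInFlight
    // running at once. No thread blocks while a request is pending.
    public static async Task<string[]> FetchAllAsync(
        IEnumerable<string> parameters,
        Func<string, Task<string>> fetchAsync,
        int maxInFlight)
    {
        using (var gate = new SemaphoreSlim(maxInFlight))
        {
            var pending = parameters.Select(async p =>
            {
                await gate.WaitAsync();
                try { return await fetchAsync(p); }
                finally { gate.Release(); }
            }).ToList();

            // Results come back in the same order as the parameters.
            return await Task.WhenAll(pending);
        }
    }

    public static void Main()
    {
        // Simulated async call; replace with the real async client call.
        Func<string, Task<string>> fake = async p =>
        {
            await Task.Delay(10);
            return p.ToUpperInvariant();
        };

        string[] results = FetchAllAsync(new[] { "a", "b", "c" }, fake, 2)
            .GetAwaiter().GetResult();
        Console.WriteLine(string.Join(",", results));
    }
}
```

With an HTTP endpoint, fetchAsync could be something like url => client.GetStringAsync(url) on a shared HttpClient instance.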
I've never heard of a situation where async I/O was somehow "not an option". I cannot even conceive of any valid use case for this constraint. If you are unable to use async I/O then this would indicate a serious design problem that must be fixed, ASAP.
While this is not a direct comparison, I think it may help you. I do something similar to what you describe (in my case I know there is a load-balanced server cluster on the other end serving REST calls). I get good results using Parallel.ForEach to spin up an optimal number of worker threads, provided that I also use the following code to tell my operating system it can connect to more than the usual number of endpoints.
// uri is the System.Uri of the endpoint you call; FindServicePoint returns a ServicePoint.
var servicePoint = System.Net.ServicePointManager.FindServicePoint(uri);
servicePoint.ConnectionLimit = 250;
Note you have to call that once for each unique URL you connect to.