I just started using the TPL, and I want to make several calls to web services happen in parallel. From what I can gather, I see two ways of doing this.
Either Parallel.ForEach:
List<ServiceMemberBase> list = new List<ServiceMemberBase>(); //Take list from somewhere.
Parallel.ForEach(list, member =>
{
var result = Proxy.Invoke(member);
//...
//Do stuff with the result
//...
});
Or Task<T>:
List<ServiceMemberBase> list = new List<ServiceMemberBase>(); //Take list from somewhere.
foreach (var member in list)
{
Task<MemberResult>.Factory.StartNew(() => proxy.Invoke(member));
}
//Wait for all tasks to finish.
//Process the result objects.
Disregarding whether the syntax is correct or not, are these two equivalent?
Will they produce the same result? If not, why? And which is preferable?
The Task Parallel Library (TPL) is a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces. The purpose of the TPL is to make developers more productive by simplifying the process of adding parallelism and concurrency to applications.
Task parallelism (also known as function parallelism and control parallelism) is a form of parallelization of computer code across multiple processors in parallel computing environments. Task parallelism focuses on distributing tasks—concurrently performed by processes or threads—across different processors.
If you have several tasks that can run in parallel, but you still need to wait for all of them to finish, you can achieve this with the Task.WhenAll() method in .NET Core.
The Parallel class provides library-based data parallel replacements for common operations such as for loops, for each loops, and execution of a set of statements.
For the code and use case you discuss, the two approaches are essentially equivalent.
Parallel.ForEach is useful when you have to partition an input range over several tasks (not applicable here), or when it makes it easier to synchronize the merging of results from several independent parallel operations (perhaps applicable here?).
In any case, you've correctly noted that in the Parallel.ForEach case, you don't have to manually synchronize the wait for completion, whereas if you manually start tasks, you do have to manage that synchronization yourself. In this case you would probably use something like Task.WaitAll(...).
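As a minimal sketch of that manual synchronization: the tasks are kept in an array, Task.WaitAll blocks until all of them finish, and the results are then read from each task. The Invoke method here is a hypothetical stand-in for Proxy.Invoke (it just squares an int), not the asker's real proxy.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WaitAllSketch
{
    // Hypothetical stand-in for Proxy.Invoke; a real call would hit the web service.
    static int Invoke(int member) => member * member;

    public static int[] InvokeAll(IEnumerable<int> members)
    {
        // Start one task per member and keep the Task<int> references.
        Task<int>[] tasks = members
            .Select(member => Task.Factory.StartNew(() => Invoke(member)))
            .ToArray();

        // Block until every task has finished.
        // If any task failed, this throws an AggregateException.
        Task.WaitAll(tasks);

        // The tasks array follows the input order, so the results do too.
        return tasks.Select(t => t.Result).ToArray();
    }
}
```

Calling WaitAllSketch.InvokeAll(new[] { 1, 2, 3 }) returns the array { 1, 4, 9 }. Note that this still starts one task per item, so it does nothing to limit how many calls run at once.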
Between the two pieces of code, Parallel.ForEach() will be more efficient, because it processes multiple items in a single Task, one after another.
But both of them will use as many threads as the ThreadPool will let them, which is not a good idea in this case. That's because the ThreadPool is good at guessing the optimal number of threads if you have very short, CPU-bound Tasks, which is far from the case here.
Because of that, I think the best option is to manually limit the degree of parallelism to a small number (you would have to measure to find out what number gives the best results):
List<ServiceMemberBase> list = …; //Take list from somewhere.
Parallel.ForEach(list, new ParallelOptions { MaxDegreeOfParallelism = 10 },
member =>
{
var result = Proxy.Invoke(member);
//...
//Do stuff with the result
//...
});
Even more efficient would be if you could execute the web service call asynchronously. Doing that and limiting the degree of parallelism at the same time is not very easy, unless you're on C# 5. If you were on C# 5 and if you also updated Proxy to support the Task-based Asynchronous Pattern (TAP), you could use TPL Dataflow to execute your code even more efficiently:
var actionBlock = new ActionBlock<ServiceMemberBase>(
async member =>
{
var result = await Proxy.InvokeAsync(member);
//...
//Do stuff with the result
//...
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10 });
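The block above only defines the processing; items still have to be sent to it, and the caller has to wait for it to drain. A minimal sketch of that part follows, assuming the System.Threading.Tasks.Dataflow package is available. InvokeAsync is a hypothetical stand-in for Proxy.InvokeAsync (it waits briefly and doubles an int), and the ConcurrentBag is just one illustrative way to collect results.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowSketch
{
    // Hypothetical stand-in for Proxy.InvokeAsync.
    static async Task<int> InvokeAsync(int member)
    {
        await Task.Delay(10); // simulates the latency of a web service call
        return member * 2;
    }

    public static async Task<List<int>> ProcessAllAsync(IEnumerable<int> members)
    {
        // Thread-safe collection, because up to 10 delegates may run at once.
        var results = new ConcurrentBag<int>();

        var actionBlock = new ActionBlock<int>(
            async member =>
            {
                var result = await InvokeAsync(member);
                results.Add(result);
            },
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10 });

        // Queue every item; the block processes at most 10 at a time.
        foreach (var member in members)
            actionBlock.Post(member);

        // Signal that no more items will arrive, then wait for the queue to drain.
        actionBlock.Complete();
        await actionBlock.Completion;

        // Completion order is not deterministic, so sort before returning.
        var sorted = new List<int>(results);
        sorted.Sort();
        return sorted;
    }
}
```

Awaiting DataflowSketch.ProcessAllAsync(new[] { 3, 1, 2 }) yields the list 2, 4, 6. Without the Complete() call, awaiting Completion would never finish, since the block would keep waiting for more input.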