Summary: I changed from System.Threading.Tasks.Parallel.ForEach and Concurrent Data structure to a simple plinq (Parallel Linq) query. The speed up was amazing.
So is plinq inherently faster than Parallel.ForEach? Or is it specific to the task.
// Original Code
// concurrent dictionary to store results
var resultDict = new ConcurrentDictionary<string, MyResultType>();
Parallel.ForEach(items, item =>
{
resultDict.TryAdd(item.Name, PerformWork(source));
});
// new code
var results =
items
.AsParallel()
.Select(item => new { item.Name, queryResult = PerformWork(item) })
.ToDictionary(kv => kv.SourceName, kv => kv.queryResult);
Notes: Each task (PerformWork) now runs between 0 and 200 ms. It used to take longer before I optimized it. That's why I was using the Tasks.Parallel library in the fist place. So I went from 2 seconds total time to ~100-200 ms total time, performing roughly the same work, just with different methods. (Wow linq and plinq are awesome!)
Questions:
Whereas PLINQ is largely based on a functional style of programming with no side-effects, side-effects are precisely what the TPL is for. If you want to actually do work in parallel as opposed to just searching/selecting things in parallel, you use the TPL.
Can I assume that because my pattern is basically functional (giving inputs produce new outputs without mutation), that plinq is the correct technology to use?
I'm looking for validation that my assumptions are correct, or an indication that I'm missing something.
Foreach loop in C# runs upon multiple threads and processing takes place in a parallel way. Which means it is looping through all items at once without waiting for the previous item to complete. The execution of Parallel. Foreach is faster than normal ForEach.
use the Parallel. ForEach method for the simplest use case, where you just need to perform an action for each item in the collection. use the PLINQ methods when you need to do more, e.g. query the collection or to stream the data.
Parallel. ForEach is like the foreach loop in C#, except the foreach loop runs on a single thread and processing take place sequentially, while the Parallel. ForEach loop runs on multiple threads and the processing takes place in a parallel manner.
No, it doesn't block and returns control immediately. The items to run in parallel are done on background threads.
It's not possible to use these 2 code samples to do a definitive comparison between Parallel.ForEach
and PLINQ. The code samples are simply too different.
The first item that jumps out at me is the first sample uses ConcurrentDictionary
and the second uses Dictionary
. These two types have very different uses and performance characteristics. In order to get an accurate comparison between the two technologies you need to be consistent here with the types.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With