 

Parallel.ForEach losing data

Parallel.ForEach improves performance; however, I am seeing data loss.

What I tried (the variables `results` and `processedData` are both `ConcurrentBag<IwrRows>`):

1)

Parallel.ForEach(results, () => new ConcurrentBag<IwrRows>(), (n, loopState, localData) =>
{
    return ProcessData(n); // ProcessData contains the complicated business logic
}, (localData) => AddRows(localData, processedData, obj));

2)

await Task.Run(() => Parallel.ForEach(results, item =>
{
    ProcessData(item, processedData);
}));

3)

Parallel.ForEach(results, item =>
{
    ProcessData(item, processedData);
});

All of them lost some rows.

When I use a plain foreach block it consistently returns the same count; however, it's four times slower.

foreach (var item in results)
{
    // ProcessData returns a List<IwrRows>
    processedData.AddRange(ProcessData(item));
}

Not sure what I am missing here.

`results` contains 51,112 items. The sequential foreach always returns 41,316 rows; the Parallel.ForEach variants return 41,308, 41,313, or 41,314, varying with each run.

asked Mar 18 '16 by Karthik Giddu

2 Answers

You seem to be struggling with collecting the results back into a coherent list. You could use PLINQ, so you don't have to worry about the results container being thread-safe:

var processedData = yourData.AsParallel().Select(ProcessData).ToList();
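Since `ProcessData` returns a `List<IwrRows>`, `SelectMany` can flatten the per-item lists into one result list. Here is a self-contained sketch of that pattern; `ProcessData` and the row type below are stand-ins, since the real business logic isn't shown:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    // Stand-in for the real ProcessData: yields two rows per input item.
    static List<int> ProcessData(int item) =>
        new List<int> { item * 10, item * 10 + 1 };

    static void Main()
    {
        var results = Enumerable.Range(0, 51112).ToList();

        // PLINQ runs ProcessData in parallel and flattens each item's rows.
        // There is no shared mutable container, so no rows can be lost to
        // race conditions.
        var processedData = results
            .AsParallel()
            .SelectMany(ProcessData)
            .ToList();

        Console.WriteLine(processedData.Count); // 102224 = 51112 * 2
    }
}
```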
answered Sep 28 '22 by nvoigt

Your problem seems to be in AddRows(localData, processedData, obj). This method is probably adding data to a collection that is not thread-safe. You should add to a thread-safe collection, or synchronize access around the add.
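A sketch of both fixes, assuming the add step simply appends each item's rows to `processedData` (the row type and `ProcessData` below are stand-ins):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Demo
{
    // Stand-in for the real ProcessData: yields two rows per input item.
    static List<int> ProcessData(int item) =>
        new List<int> { item, item + 100000 };

    static void Main()
    {
        var results = Enumerable.Range(0, 51112).ToList();

        // Option 1: a thread-safe container needs no extra locking.
        var bag = new ConcurrentBag<int>();
        Parallel.ForEach(results, item =>
        {
            foreach (var row in ProcessData(item))
                bag.Add(row);
        });

        // Option 2: a plain List<T> works if every write is synchronized.
        var list = new List<int>();
        var gate = new object();
        Parallel.ForEach(results, item =>
        {
            var rows = ProcessData(item); // do the expensive work outside the lock
            lock (gate) { list.AddRange(rows); }
        });

        Console.WriteLine(bag.Count);  // 102224
        Console.WriteLine(list.Count); // 102224
    }
}
```

With an unsynchronized `List<T>`, concurrent `AddRange` calls can corrupt the list's internal array and silently drop rows, which matches the varying counts described in the question.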

answered Sep 28 '22 by Peter