Background
I have a piece of code which is highly parallelizable and I found that most of the time I am using only one core at 100% while the rest does nothing. To tackle this I've tinkered with multithreading, implementing semaphores and what not to realize that Parallel.For() is fine grained and more efficient than any of my solutions.
The Code
To simplify I will only write the pieces of code structurally important.
int sharedResource = 0;
for (int i = 0; i < someMax; i++)
{
for (int j = 0; j <= i; j++)
{
if (someCondition(i, j))
sharedResource += someFunction(i, j);
else break;
}
}
All of the ambiguously named functions are more or less just mathematical equations and are of time complexity O(1).
Important Details
Pay attention to the inner loop that has the variable i as upper boundary as well as the summation variable named sharedResource. The execution order in this case is not important as addition is commutative and I don't see any obvious reason for Amdahl's Law to apply as all instance combinations (i, j) of both loops can be calculated independently.
Question
Is it smart to use a nested Parallel.For() loop in this scenario or should I only use it instead of the outer loop (or only on the inner respectively)?
The only thing that concerns me is the sharedResource as I don't get a lot of in-depth insight of how Parallel.For() works from the documentation. Another important thing is that if I do use two Parallel.For() loops some instances will finish almost instantly due to the break while others will take much more time. Will it be able to balance this?
Whether to use nested parallel loops, parallelize only inner or only outer loop, depends a lot on nature of your data. Nested parallel loops are designed to work reasonably well. For example, if both outer and inner loops has both degree of parallelism of 8 for example - it does not mean that when nested they will process items on 8x8=64 threads, as one might think when looking naively at this.
You should measure perfomance of all options on your specific data set and figure out what works best for you.
Note that Parallel.For loop partitions interval in certain number of ranges (depending on degree of parallelism) and then those ranges are executed in parallel on separate threads. What that means is: if processing time of your items is distributed non-evenly - some ranges might complete much faster than others. Say you run with degree of parallelism 4, and processing 100 items, of which first 75 return false for someCondition and so take 0 time to execute, while last 25 return true. In result, first 3 ranges will complete immediately and last range with all real work will execute on one thread, essentially making whole thing sequential.
If uneven distribution is expected, you might use Parallel.ForEach with "real" IEnumerable instead (by real I mean it's not array or list but real "lazy" IEnumerable):
Parallel.ForEach(Enumerable.Range(0, i), j => {...})
But note that on evenly distributed data it will be slower than pre-partitioned version.
Nested Parallel.For might also help if run-time is unevenly distributed, but again - you have to measure each option on your real data and choose the best one.
As for thread safety. Of course, this
sharedResource += someFunction(i, j);
is not thread safe inside parallel loops. Using lock here might degrade perfomance a lot if someFunction is fast, and not necessary anyway. Either just use
Interlocked.Add(ref sharedResource, someFunction(i, j))
Or you can use overloads of Parallel.For`Parallel.ForEach` which allow accumulation of values per each running thread, and then aggregation of results. For example:
Parallel.For(0, 100, (i, outerState) =>
{
Parallel.ForEach(Enumerable.Range(0, i), () => 0, (j, innerState, subTotal) =>
{
if (someCondition(i, j))
return subTotal + someFunction(i, j);
else {
innerState.Break();
return subTotal;
}
}, subTotalOfThread => Interlocked.Add(ref sharedResource, subTotalOfThread));
});
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With