
Factors for determining partitionCount for C# Partitioner.GetPartitions()

Below is an implementation of ForEachAsync written by Stephen Toub

public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body) 
{ 
    return Task.WhenAll( 
        from partition in Partitioner.Create(source).GetPartitions(dop) 
        select Task.Run(async delegate { 
            using (partition) 
                while (partition.MoveNext()) 
                    await body(partition.Current); 
        })); 
}

What factors should be considered when specifying a partitionCount (dop in this case)?

Does the hardware make a difference (# of cores, available RAM, etc)?

Does the type of data/operation influence the count?

My first guess would be to set dop equal to Environment.ProcessorCount for general cases, but my gut tells me that's probably unrelated.

Jim Buck asked Dec 18 '15 16:12



1 Answer

Both the hardware and the operations being executed matter a lot.

If you want to run CPU-bound work that is not constrained in any other way, you don't need this method at all. You're better off using Parallel or PLINQ, which are made for that (and suck terribly at IO).
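For the CPU-bound case, something along these lines is what I mean. This is only a sketch: SumOfSquares is a made-up stand-in for real work, and the class and method names are mine, not from the question. Parallel.For is capped explicitly at the core count here; PLINQ is left at its default, which is already based on the number of cores.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class CpuBoundExample
{
    // Hypothetical CPU-bound work item (assumption, not from the question).
    public static long SumOfSquares(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++)
            total += (long)i * i;
        return total;
    }

    // Parallel.For with the degree of parallelism capped at the core count.
    public static long[] RunWithParallelFor(int[] inputs)
    {
        var results = new long[inputs.Length];
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };
        // Each iteration writes to its own slot, so no locking is needed.
        Parallel.For(0, inputs.Length, options, i =>
        {
            results[i] = SumOfSquares(inputs[i]);
        });
        return results;
    }

    // PLINQ equivalent; AsOrdered keeps results in input order.
    public static long[] RunWithPlinq(int[] inputs)
    {
        return inputs.AsParallel()
                     .AsOrdered()
                     .Select(n => SumOfSquares(n))
                     .ToArray();
    }
}
```

Here Environment.ProcessorCount is a sensible cap because the cores themselves are the bottleneck. That reasoning does not carry over to IO.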

For IO there is no easy way to predict the best DOP. For example, magnetic disks like DOP 1. SSDs like 4-16(?). Web services could like pretty much any value. I could continue this list with dozens more factors, including databases, lock contention, etc.

You need to test different values in a test environment, then use the best-performing one.
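A rough sketch of what such a test could look like: time one full pass per candidate DOP and pick the fastest. MeasureAsync and Fastest are hypothetical helper names of mine, and ForEachAsync is repeated from the question only so the snippet compiles on its own. Treat this as a starting point, not a proper benchmark (no warm-up, no repeated runs).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class DopBenchmark
{
    // Same ForEachAsync as in the question, repeated to keep the sketch self-contained.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate
            {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }

    // Hypothetical helper: runs the whole workload once per candidate DOP
    // and records the elapsed wall-clock time for each.
    public static async Task<Dictionary<int, TimeSpan>> MeasureAsync<T>(
        IReadOnlyList<T> items, IEnumerable<int> candidateDops, Func<T, Task> body)
    {
        var timings = new Dictionary<int, TimeSpan>();
        foreach (int dop in candidateDops)
        {
            var stopwatch = Stopwatch.StartNew();
            await items.ForEachAsync(dop, body);
            stopwatch.Stop();
            timings[dop] = stopwatch.Elapsed;
        }
        return timings;
    }

    // Hypothetical helper: returns the DOP with the shortest measured time.
    public static int Fastest(Dictionary<int, TimeSpan> timings)
    {
        return timings.OrderBy(pair => pair.Value).First().Key;
    }
}
```

You would call it with your real IO operation as the body, e.g. `await DopBenchmark.MeasureAsync(items, new[] { 1, 2, 4, 8, 16 }, item => DoIoAsync(item))`, where DoIoAsync is whatever your actual work is. A fake body like Task.Delay will not show the contention effects that make the real optimum hard to predict, so measure against the real disk, service, or database.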

Using Environment.ProcessorCount makes no sense with IO. When you add CPUs, IO does not get faster.

usr answered Sep 30 '22 12:09
