In which scenarios Will Range partitioning will be better choice than Chunk partitioning ? (and vise verse)
I already know that
Chunk partitioning : grabbing smal chunks of elements from the input to process , he starts with small chunks then , increases the chunk size.
Range partitioning preallocates an equal number of elements to each worker
Also , Why this code : ( finding prime numbers till 100000)
IEnumerable<int> numbers = Enumerable.Range (3, 100000-3);
var parallelQuery = from n in numbers.AsParallel()
where Enumerable.Range (2, (int) Math.Sqrt (n)).All (i => n % i > 0)
select n;
Might perform poorly with range partitioning ?
While this code : (finding the sum of sqrt of the first million numbers)
ParallelEnumerable.Range (1, 10000000).Sum (i => Math.Sqrt (i))
Will be better a better choice to use with range partitioning ?
In the first sample , the required tme per item depends on n
. Searching for the next prime number after 90000 takes more time than finding the one after 11.
As a result, when divided into equal ranges, the last ranges will have to perform much more work than the first.
In the second sample the time-per-operation is equal over the entire range. So range partitioning will work well.
A member of the PFX Team at Microsoft gives a good explanation on the PFX Dev Blog:
Partitioning in PLINQ
Based on many factors, we have 4 primary algorithms that we use for partitioning alone. They’re worth getting to know, because we’ll talk more about them and tweaks that we make to them in future technology discussions.
- Range Partitioning – This is a pretty common partitioning scheme, similar to the one that I described in the example above. This is amenable to many query shapes, though it only works with indexible data sources such as lists and arrays (i.e. IList and T[]). If you give PLINQ something typed as IEnumerable or IEnumerable, PLINQ will query for the ILIst interface, and if it’s found, will use that interface implementation with range partitioning. The benefits of these data sources is that we know the exact length and can access any of the elements within the arrays directly. For the majority of cases, there are large performance benefits to using this type of partitioning.
- Chunk Partitioning – This is a general purpose partitioning scheme that works for any data source, and is the main partitioning type for non-indexible data sources. In this scheme, worker threads request data, and it is served up to the thread in chunks. IEnumerables and IEnumerables do not have fixed Count properties (there is a LINQ extension method for this, but that is not the same), so there’s no way to know when or if the data source will enumerate completely. It could be 3 elements, it could be 3 million elements, it could be infinite. A single system needs to take all of these possibilities into account and factor in different delegate sizes, uneven delegate sizes, selectivity etc. The chunk partitioning algorithm is quite general and PLINQ’s algorithm had to be tuned for good performance on a wide range of queries. We’ve experimented with many different growth patterns and currently use a plan that doubles after a certain number of requests. This is subject to change as we tune for performance, so don’t depend on this. Another important optimization is that chunk partitioning balances the load among cores, as the tasks per core dynamically request more work as needed. This ensures that all cores are utilized throughout the query and can all cross the finish line at the same time vs. a ragged, sequential entry to the end.
- Striped Partitioning – This scheme is used for SkipWhile and TakeWhile and is optimized for processing items at the head of a data source (which obviously suits the needs of SkipWhile and TakeWhile). In striped partitioning, each of the n worker threads is allocated a small number of items (sometimes 1) from each block of n items. The set of items belonging to a single thread is called a ‘stripe’, hence the name. A useful feature of this scheme is that there is no inter-thread synchronization required as each worker thread can determine its data via simple arithmetic. This is really a special case of range partitioning and only works on arrays and types that implement IList.
- Hash Partitioning – Hash partitioning is a special type of partitioning that is used by the query operators that must compare data elements (these operators are: Join, GroupJoin, GroupBy, Distinct, Except, Union, Intersect). When hash-partitioning occurs (which is just prior to any of the operators mentioned), all of the data is processed and channeled to threads such that items with identical hash-codes will be handled by the same thread. This hash-partitioning work is costly, but it means that all the comparison work can then be performed without further synchronization. Hash partitioning assigns every element to an output partition based on a hash computed from each element’s key. This can be an efficient way of building a hash-table on the fly concurrently, and can be used to accomplish partitioning and hashing for the hash join algorithm. The benefit is that PLINQ can now use the same hash partitioning scheme for the data source used for probing; this way all possible matches end up in the same partition, meaning less shared data and smaller hash table sizes (each partition has its own hash table). There’s a lot going on with hash-partitioning, so it’s not as speedy as the other types, especially when ordering is involved in the query. As a result the query operators that rely upon it have additional overheads compared to simpler operators.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With