Was going through Java 8
features, mentioned here. Couldn't understand what parallelSort()
does exactly. Can someone explain what is the actual difference between sort()
and parallelSort()
?
Arrays package that use the JSR 166 Fork/Join parallelism common pool to provide sorting of arrays in parallel. The methods are called parallelSort() and are overloaded for all the primitive data types and Comparable objects. The following table contains Arrays overloaded sorting methods.
Parallel Sort uses Fork/Join framework introduced in Java 7 to assign the sorting tasks to multiple threads available in the thread pool. Fork/Join implements a work stealing algorithm where in a idle thread can steal tasks queued up in another thread.
Collections. sort() Operates on List Whereas Arrays. sort() Operates on an Array. Arrays.
As mentioned in the official JavaDoc, Arrays. sort uses dual-pivot Quicksort on primitives. It offers O(n log(n)) performance and is typically faster than traditional (one-pivot) Quicksort implementations. However, it uses a stable, adaptive, iterative implementation of mergesort algorithm for Array of Objects.
Parallel sort uses threading - each thread gets a chunk of the list and all the chunks are sorted it in parallel. These sorted chunks are then merged into a result.
It's faster when there are a lot of elements in the collection. The overhead for parallelization (splitting into chunks and merging) becomes tolerably small on larger collections, but it is large for smaller ones.
Take a look at this table (of course, the results depend on the CPU, number of cores, background processes, etc):
Taken from this link: http://www.javacodegeeks.com/2013/04/arrays-sort-versus-arrays-parallelsort.html
Arrays.parallelSort() :
The method uses a threshold value and any array of size lesser than the threshold value is sorted using the Arrays#sort() API (i.e sequential sorting). And the threshold is calculated considering the parallelism of the machine, size of the array and is calculated as:
private static final int getSplitThreshold(int n) { int p = ForkJoinPool.getCommonPoolParallelism(); int t = (p > 1) ? (1 + n / (p << 3)) : n; return t < MIN_ARRAY_SORT_GRAN ? MIN_ARRAY_SORT_GRAN : t; }
Once its decided whether to sort the array in parallel or in serial, its now to decide how to divide the array in to multiple parts and then assign each part to a Fork/Join task which will take care of sorting it and then another Fork/Join task which will take care of merging the sorted arrays. The implementation in JDK 8 uses this approach:
Divide the array into 4 parts.
Sort the first two parts and then merge them.
Sort the next two parts and then merge them. And the above steps are repeated recursively with each part until the size of the part to sort is not lesser than the threshold value calculated above.
You can also read the implementation details in the Javadoc
The sorting algorithm is a parallel sort-merge that breaks the array into sub-arrays that are themselves sorted and then merged. When the sub-array length reaches a minimum granularity, the sub-array is sorted using the appropriate Arrays.sort method. If the length of the specified array is less than the minimum granularity, then it is sorted using the appropriate Arrays.sort method. The algorithm requires a working space no greater than the size of the specified range of the original array. The ForkJoin common pool is used to execute any parallel tasks.
Array.sort():
This uses merge sort OR Tim Sort underneath to sort the contents. This is all done sequentially, even though merge sort uses divide and conquer technique, its all done sequentially.
Source
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With