What is the fundamental difference between quicksort and tuned quicksort? What improvements are made to quicksort? How does Java decide to use this instead of merge sort?
As Bill the Lizard said, a tuned quicksort still has the same average complexity as the basic quicksort - O(N log N) - but it uses various means to try to avoid the O(N^2) worst case, as well as optimizations to reduce the constant factor in front of the N log N in the average running time.
Worst Case Time Complexity
The worst case for quicksort occurs when one side of the partition has zero elements at every step. Near-worst-case behavior occurs when the ratio of elements in one partition to the other is very far from 1:1 (10000:1, for instance). Common causes of this worst case complexity include, but are not limited to:
A quicksort algorithm that always chooses the element at the same relative index of the subarray as the pivot. For instance, on an array that is already sorted, a quicksort that always picks the leftmost or rightmost element of the subarray as the pivot will be O(N^2). A quicksort that always picks the middle element gives O(N^2) on an organ-pipe array ([1,2,3,4,5,4,3,2,1], for example).
A quicksort algorithm that doesn't handle repeated/duplicate elements can be O(N^2). The obvious example is sorting an array in which all the elements are equal. Explicitly, if the quicksort partitions the array as [ < p | >= p ], then the left partition will always have zero elements.
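To make both failure modes concrete, here's a minimal, deliberately naive quicksort sketch (leftmost pivot, [ < p | >= p ] partitioning; the names are just for illustration):

```java
// Deliberately naive quicksort: leftmost element as pivot, partitioned
// as [ < pivot | >= pivot ]. Do not use this in practice.
public class NaiveQuicksort {
    public static void sort(int[] a, int lo, int hi) { // sorts a[lo..hi]
        if (lo >= hi) return;
        int pivot = a[lo];                 // always the leftmost element
        int mid = lo;                      // a[lo+1..mid] holds elements < pivot
        for (int i = lo + 1; i <= hi; i++) {
            if (a[i] < pivot) {            // equal elements go right
                mid++;
                int t = a[mid]; a[mid] = a[i]; a[i] = t;
            }
        }
        int t = a[lo]; a[lo] = a[mid]; a[mid] = t; // put pivot in its final place
        sort(a, lo, mid - 1);  // on sorted or all-equal input this call is
        sort(a, mid + 1, hi);  // always empty -> N levels of recursion, O(N^2)
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8};    // leftmost pivot degrades here
        int[] allEqual = {5, 5, 5, 5, 5, 5, 5, 5};  // [ < p | >= p ] degrades here
        sort(sorted, 0, sorted.length - 1);
        sort(allEqual, 0, allEqual.length - 1);
        System.out.println(java.util.Arrays.toString(sorted));
    }
}
```

On the sorted input, nothing is ever less than the leftmost pivot, so the left recursion is always empty; on the all-equal input the same thing happens, because equal keys all land in the >= partition.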
How are these remedied? The first is generally remedied by choosing the pivot randomly. Using the median of a few elements at fixed positions (say, first, middle, and last) also helps, but an unlucky or adversarial input can still drive it to O(N^2), which a random pivot makes vanishingly unlikely. Taking the median of a few randomly chosen elements combines both ideas; the median of three randomly chosen elements is a common choice here.
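A sketch of those two pivot strategies (the helper names are hypothetical, not from any library):

```java
import java.util.concurrent.ThreadLocalRandom;

final class PivotChoice {
    // A uniformly random pivot index in [lo, hi].
    static int randomPivot(int lo, int hi) {
        return ThreadLocalRandom.current().nextInt(lo, hi + 1); // bound is exclusive
    }

    // The index of the median of three randomly chosen elements of a[lo..hi].
    static int medianOfThreeRandom(int[] a, int lo, int hi) {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        int i = rnd.nextInt(lo, hi + 1);
        int j = rnd.nextInt(lo, hi + 1);
        int k = rnd.nextInt(lo, hi + 1);
        // Return the index holding the median of a[i], a[j], a[k].
        if (a[i] < a[j]) {
            if (a[j] < a[k]) return j;       // a[i] < a[j] < a[k]
            return (a[i] < a[k]) ? k : i;    // a[k] <= a[j]: pick the middle value
        } else {
            if (a[i] < a[k]) return i;       // a[j] <= a[i] < a[k]
            return (a[j] < a[k]) ? k : j;    // a[k] <= a[i]: pick the middle value
        }
    }
}
```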
The second case, repeated elements, is usually solved with something like Bentley-McIlroy partitioning or the solution to the Dutch National Flag problem. Bentley-McIlroy partitioning is more commonly used, however, because it is usually faster. I've come up with a method that is faster than it, but that's not the point of this post.
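Here's a sketch of the Dutch National Flag approach, the simpler of the two (Bentley-McIlroy differs in detail: it collects equal keys at the two ends of the subarray during the scan and swaps them into the middle afterward):

```java
// Three-way (Dutch National Flag) partition of a[lo..hi] around the value p:
// rearranges into [ < p | == p | > p ] and returns the bounds of the == block.
static int[] threeWayPartition(int[] a, int lo, int hi, int p) {
    int lt = lo, i = lo, gt = hi;
    while (i <= gt) {
        if (a[i] < p) {
            int t = a[lt]; a[lt] = a[i]; a[i] = t;
            lt++; i++;
        } else if (a[i] > p) {
            int t = a[gt]; a[gt] = a[i]; a[i] = t;
            gt--;                      // a[i] came from the right and is unexamined
        } else {
            i++;                       // equal to the pivot: leave it in the middle
        }
    }
    // Recurse only on a[lo..lt-1] and a[gt+1..hi]; the equal block is already
    // in place, so an all-equal array finishes in a single O(N) pass.
    return new int[] { lt, gt };
}
```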
Optimizations
Beyond the worst-case remedies above, here are some common optimizations:
Using the converging-pointers quicksort (the partition scans inward from both ends of the subarray, swapping out-of-place pairs) as opposed to the basic quicksort; a sketch combining this with the next two items follows this list. Let me know if you want more elaboration on this.
Insertion-sorting subarrays once they fall below a certain size. Insertion sort is asymptotically O(N^2), but for small enough N it beats quicksort.
Using an iterative quicksort with an explicit stack as opposed to a recursive quicksort, pushing the larger partition and looping on the smaller one so the stack depth stays O(log N).
Unrolling parts of loops to reduce the number of comparisons.
Copying the pivot into a local variable (ideally a register) and treating its slot in the array as a hole, so elements are moved into the hole one write at a time instead of being fully swapped (see the second sketch below).
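A sketch tying the first three optimizations together; the cutoff value and all names here are illustrative, not taken from any particular implementation:

```java
import java.util.ArrayDeque;

final class TunedQuicksort {
    private static final int CUTOFF = 16; // illustrative small-subarray threshold

    static void sort(int[] a) {
        // Explicit stack of [lo, hi] ranges instead of recursion.
        ArrayDeque<int[]> stack = new ArrayDeque<>();
        stack.push(new int[] { 0, a.length - 1 });
        while (!stack.isEmpty()) {
            int[] range = stack.pop();
            int lo = range[0], hi = range[1];
            while (hi - lo >= CUTOFF) {
                int j = hoarePartition(a, lo, hi);
                // Push the larger side and keep looping on the smaller side,
                // bounding the stack depth at O(log N).
                if (j - lo + 1 < hi - j) {
                    stack.push(new int[] { j + 1, hi });
                    hi = j;
                } else {
                    stack.push(new int[] { lo, j });
                    lo = j + 1;
                }
            }
            insertionSort(a, lo, hi); // small subarray: insertion sort wins
        }
    }

    // Converging-pointers (Hoare-style) partition: the two indices scan toward
    // each other, swapping out-of-place pairs. Returns j such that
    // a[lo..j] <= pivot <= a[j+1..hi].
    private static int hoarePartition(int[] a, int lo, int hi) {
        int pivot = a[lo + (hi - lo) / 2]; // middle element; could be randomized
        int i = lo - 1, j = hi + 1;
        while (true) {
            do { i++; } while (a[i] < pivot);
            do { j--; } while (a[j] > pivot);
            if (i >= j) return j;
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
    }

    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= lo && a[j] > key) { a[j + 1] = a[j]; j--; }
            a[j + 1] = key;
        }
    }
}
```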
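And the pivot-in-a-register trick from the last item, shown in isolation (in practice it's combined with the pivot-selection and duplicate-handling remedies above; as written it still degrades on all-equal input):

```java
// "Hole" partitioning: the pivot lives in a local variable, so its slot in
// the array is a hole. Each out-of-place element is moved into the hole
// (one write) instead of being swapped (three writes).
static int holePartition(int[] a, int lo, int hi) {
    int pivot = a[lo];             // hole is now at index lo
    int i = lo, j = hi;
    while (i < j) {
        while (i < j && a[j] >= pivot) j--;
        a[i] = a[j];               // fill the hole at i; hole moves to j
        while (i < j && a[i] <= pivot) i++;
        a[j] = a[i];               // fill the hole at j; hole moves to i
    }
    a[i] = pivot;                  // drop the pivot into the final hole
    return i;                      // a[lo..i-1] <= pivot <= a[i+1..hi]
}
```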
Other Notes
Java uses mergesort when sorting objects because it is a stable sort (the relative order of elements with equal keys is preserved). Quicksort can be stable or unstable, but the stable version is slower than the unstable version.
"Tuned" quicksort just means that some improvements are applied to the basic algorithm. Usually the improvements are to try and avoid worst case time complexity. Some examples of improvements might be to choose the pivot (or multiple pivots) so that there's never only 1 key in a partition, or only make the recursive call when a partition is above a certain minimum size.
It looks like Java only uses merge sort when sorting Objects (the Arrays doc tells you which sorting algorithm is used for which sort method signature), so I don't think it ever really "decides" on its own, but the decision was made in advance. (Also, implementers are free to use another sort, as long as it's stable.)