Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the difference between quicksort and tuned quicksort?

What is the fundamental difference between quicksort and tuned quicksort? What is the improvement given to quicksort? How does Java decide to use this instead of merge sort?

like image 591
unj2 Avatar asked May 05 '10 19:05

unj2


2 Answers

As Bill the Lizard said, a tuned quicksort still has the same complexity as the basic quicksort - O(N log N) average complexity - but a tuned quicksort uses some various means to try to avoid the O(N^2) worst case complexity as well as uses some optimizations to reduce the constant that goes in front of the N log N for average running time.

Worst Case Time Complexity

Worst case time complexity occurs for quicksort when one side of the partition at each step always has zero elements. Near worst case time complexity occurs when the ratio of the elements in one partition to the other partition is very far from 1:1 (10000:1 for instance). Common causes of this worst case complexity include, but are not limited to:

  1. A quicksort algorithm that always chooses the element with the same relative index of a subarray as the pivot. For instance, with an array that is already sorted, a quicksort algorithm that always chooses the leftmost or rightmost element of the subarray as the pivot will be O(N^2). A quicksort algorithm that always chooses the middle element gives O(N^2) for the organ pipe array ([1,2,3,4,5,4,3,2,1] is an example of this).

  2. A quicksort algorithm that doesn't handle repeated/duplicate elements in the array can be O(N^2). The obvious example is sorting an array that contains all the same elements. Explicitly, if the quicksort sorts the array into partitions like [ < p | >= p ], then the left partition will always have zero elements.

How are these remedied? The first is generally remedied by choosing the pivot randomly. Using a median of a few elements as the pivot can also help, but the probability of the sort being O(N^2) is higher than using a random pivot. Of course, the median of a few randomly chosen elements might be a wise choice too. The median of three randomly chosen elements as the pivot is a common choice here.

The second case, repeated elements, is usually solved with something like Bentley-McIlroy paritioning(links to a pdf) or the solution to the Dutch National Flag problem. The Bentley-McIlroy partitioning is more commonly used, however, because it is usually faster. I've come up with a method that is faster than it, but that's not the point of this post.

Optimizations

Here are some common optimizations outside of the methods listed above to help with worst case scenarios:

  1. Using the converging pointers quicksort as opposed to the basic quicksort. Let me know if you want more elaboration on this.

  2. Insertion sort subarrays when they get below a certain size. Insertion sort is asymptotically O(N^2), but for small enough N, it beats quicksort.

  3. Using an iterative quicksort with an explicit stack as opposed to a recursive quicksort.

  4. Unrolling parts of loops to reduce the number of comparisons.

  5. Copying the pivot to a register and using that space in the array to reduce the time cost of swapping elements.

Other Notes

Java uses mergesort when sorting objects because it is a stable sort (the order of elements that have the same key is preserved). Quicksort can be stable or unstable, but the stable version is slower than the unstable version.

like image 148
Justin Peel Avatar answered Nov 15 '22 18:11

Justin Peel


"Tuned" quicksort just means that some improvements are applied to the basic algorithm. Usually the improvements are to try and avoid worst case time complexity. Some examples of improvements might be to choose the pivot (or multiple pivots) so that there's never only 1 key in a partition, or only make the recursive call when a partition is above a certain minimum size.

It looks like Java only uses merge sort when sorting Objects (the Arrays doc tells you which sorting algorithm is used for which sort method signature), so I don't think it ever really "decides" on its own, but the decision was made in advance. (Also, implementers are free to use another sort, as long as it's stable.)

like image 4
Bill the Lizard Avatar answered Nov 15 '22 18:11

Bill the Lizard