 

How to optimize quicksort

I am trying to work out an efficient quicksort algorithm. It works okay, but takes a long time to run when the number of elements is huge and certain sections of the array are pre-sorted. Looking up the Wikipedia article on quicksort, I found this:

To make sure at most O(log N) space is used, recurse first into the smaller half of the array, and use a tail call to recurse into the other.

Use insertion sort, which has a smaller constant factor and is thus faster on small arrays, for invocations on such small arrays (i.e. where the length is less than a threshold t determined experimentally). This can be implemented by leaving such arrays unsorted and running a single insertion sort pass at the end, because insertion sort handles nearly sorted arrays efficiently. A separate insertion sort of each small segment as they are identified adds the overhead of starting and stopping many small sorts, but avoids wasting effort comparing keys across the many segment boundaries, which keys will be in order due to the workings of the quicksort process. It also improves the cache use.

I am currently recursing into both partitions. Any idea how to implement the first tip? What is meant by "recurse first into the smaller half of the array, and use a tail call to recurse into the other"? And secondly, how can I implement an insertion sort within quicksort? Will it always improve the efficiency, or only when certain sections of the array are pre-sorted? If it is the second case, then of course I have no way of knowing when that will occur. So when should I include the insertion sort?

asked Sep 17 '12 by SexyBeast


3 Answers

In quicksort you choose a pivot that splits the array into two halves; in most cases one of them will be smaller.

E.g. for an array of size 100 the pivot may split it 40/60, where 40 is the smaller half.

Let's assume you decide on a threshold of 10 for switching to insertion sort. You keep recursively splitting the array by the pivot, and whenever one of the halves becomes smaller than or equal to 10 elements, you sort it with insertion sort, which has a small constant factor and is very fast on such small arrays.

Take into account that insertion sort will behave badly if your array is sorted in reverse (its worst case).

As regards the recursion, you only need to modify the base case of the quicksort recursion: when the array size is <= 10, stop recursing and sort the whole subarray (which is much smaller at this recursion step) using insertion sort.
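
A minimal sketch of such an insertion sort over a subrange tab[l..r] (the signature is made up here to match the C snippets in the answer below):

void insertion_sort(int * tab, int l, int r)
{
    int i, j, key;
    for(i = l + 1; i <= r; i++)
    {
        key = tab[i];
        j = i - 1;
        //shift elements bigger than key one slot to the right
        while(j >= l && tab[j] > key)
        {
            tab[j + 1] = tab[j];
            j--;
        }
        tab[j + 1] = key;
    }
}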

By a tail call they mean: recurse into the smaller half first, then handle the larger half as the very last action of the function. Such a tail call can be replaced by a loop (by the compiler or by hand), so the recursion depth stays O(log N); that is what saves space.

  Quick-sort()
      choose a pivot
      move the smaller elements to the left
      move the bigger elements to the right
      quick-sort the smaller half of the array

      if the other half is smaller than X
         do an insertion sort on it
      else
         quick-sort it as the last step <- this is the tail call, which can become a loop

As far as I can see, the second optimization suggests not running insertion sort at every recursion step, but instead leaving the small slices unsorted and running a single insertion sort pass over the whole array at the end. This improves cache use, but it is slightly more difficult to implement.
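
A hedged sketch of that batched variant, assuming an insertion_sort(tab, l, r) helper like the one above and any standard partition(tab, l, r):

#define THRESHOLD 10

//quicksort that leaves slices below THRESHOLD unsorted
void quicksort_coarse(int * tab, int l, int r)
{
    int q;
    if(r - l + 1 <= THRESHOLD)
        return; //leave this slice for the final pass
    q = partition(tab, l, r);
    quicksort_coarse(tab, l, q - 1);
    quicksort_coarse(tab, q + 1, r);
}

void sort(int * tab, int n)
{
    quicksort_coarse(tab, 0, n - 1);
    insertion_sort(tab, 0, n - 1); //single pass over the nearly sorted array
}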

answered Oct 10 '22 by Michael

There are multiple ways to make standard quicksort more efficient. To implement the first tip from your post you should write something like:

void quicksort(int * tab, int l, int r)
{
   int q;
   while(l < r)
   {
      q = partition(tab, l, r);
      if(q - l < r - q) //recurse into the smaller half
      {
         quicksort(tab, l, q - 1);
         l = q + 1; //then loop on the bigger half instead of recursing
      } else
      {
         quicksort(tab, q + 1, r);
         r = q - 1; //then loop on the bigger half instead of recursing
      }
   }
}
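
Because the recursive call always goes into the half with fewer elements, each recursive call at most halves the range, so the recursion depth is bounded by log2(n); the bigger half is handled by the while loop, which uses no extra stack.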

Hope that's clear enough. The next step would be to implement your own stack (or use a built-in one from whatever language you are using) instead of using recursive calls. Example (pseudo)code:

void quicksort2(int * tab, int l, int r)
{
    int le, ri, q;
    init stack;
    push(l, r, stack);
    while(!empty(stack))
    {
        //take the top pair of values from the stack and set them to le and ri
        pop(le, ri, stack);
        if(le >= ri)
            continue;
        q = partition(tab, le, ri);
        //push the bigger half first, so the smaller half is popped and
        //processed first -- this keeps the stack depth at O(log n)
        if(q - le < ri - q)
        {
            push(q + 1, ri, stack);
            push(le, q - 1, stack);
        } else
        {
            push(le, q - 1, stack);
            push(q + 1, ri, stack);
        }
    }
    delete stack;
}

Then you can proceed to implement the other tip from your post. To do this you should set some arbitrary constant, let's call it CUT_OFF, to around 20. This will tell your algorithm when it should switch to insertion sort. It should be rather easy (a matter of adding one if-statement) to alter the previous example so that it switches to insertion sort once a subarray is shorter than CUT_OFF, so I will leave that to you.
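
For reference, a hedged sketch of that change applied to the first (recursive) version, assuming an insertion_sort(tab, l, r) helper like the one sketched in the answer above:

#define CUT_OFF 20

void quicksort3(int * tab, int l, int r)
{
    int q;
    while(l < r)
    {
        if(r - l + 1 < CUT_OFF) //small subarray: finish it with insertion sort
        {
            insertion_sort(tab, l, r);
            return;
        }
        q = partition(tab, l, r);
        if(q - l < r - q)
        {
            quicksort3(tab, l, q - 1);
            l = q + 1;
        } else
        {
            quicksort3(tab, q + 1, r);
            r = q - 1;
        }
    }
}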

As for the partition method, I would recommend using the Lomuto partition scheme instead of Hoare's.
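
For completeness, a textbook Lomuto partition that matches the partition(tab, l, r) calls above (it uses the last element as the pivot; your own pivot choice may differ):

//Lomuto partition: uses tab[r] as the pivot, returns its final position
int partition(int * tab, int l, int r)
{
    int pivot = tab[r];
    int i = l; //boundary of the "smaller than pivot" region
    int j, tmp;
    for(j = l; j < r; j++)
    {
        if(tab[j] < pivot)
        {
            tmp = tab[i]; tab[i] = tab[j]; tab[j] = tmp;
            i++;
        }
    }
    tmp = tab[i]; tab[i] = tab[r]; tab[r] = tmp; //put the pivot in place
    return i;
}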

However, if your data is often largely pre-sorted, you could consider using a different algorithm altogether. From my experience, a natural merge sort implemented on a linked list is a very good choice for such data.
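
A sketch of the idea, assuming a simple singly linked list of ints (the node type and helper names here are made up for illustration): natural merge sort detects the runs that are already in order and merges adjacent runs pass by pass, so fully pre-sorted input costs only a single scan.

#include <stddef.h>

struct node { int value; struct node *next; };

//stable merge of two sorted lists
static struct node *merge(struct node *a, struct node *b)
{
    struct node head, *tail = &head;
    while(a && b)
    {
        if(a->value <= b->value) { tail->next = a; a = a->next; }
        else                     { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = a ? a : b;
    return head.next;
}

//detach the leading non-decreasing run from *list and return it
static struct node *take_run(struct node **list)
{
    struct node *run = *list, *p = run;
    while(p->next && p->value <= p->next->value)
        p = p->next;
    *list = p->next; //rest of the list
    p->next = NULL;  //terminate the run
    return run;
}

//merge adjacent runs, pass after pass, until only one run is left
struct node *natural_merge_sort(struct node *list)
{
    int merges;
    if(!list) return NULL;
    do {
        struct node *result = NULL, **tail = &result;
        merges = 0;
        while(list)
        {
            struct node *run = take_run(&list);
            if(list) //a second run exists: merge the pair
            {
                run = merge(run, take_run(&list));
                merges++;
            }
            *tail = run; //append, then advance to the end
            while(*tail) tail = &(*tail)->next;
        }
        list = result;
    } while(merges > 0); //zero merges means a single run, i.e. sorted
    return list;
}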

answered Oct 10 '22 by tu_ru

Some time ago I wrote a quicksort-based algorithm that you can find here (it is actually a selection algorithm, but it can be used as a sort algorithm too):

  • http://processors.wiki.ti.com/index.php/Top_K_Elements_Sort_For_DSP_With_Index_Ordering

The lessons I learned from this experience are the following:

  1. Carefully tune the partition loop of your algorithm. This is often underestimated, but you get a significant performance boost if you take care to write loops that the compiler/CPU will be able to software-pipeline. This alone led to a win of about 50% in CPU cycles.
  2. Hand-coding small sorts gives you a major performance win. When the number of elements to be sorted in a partition is under 8, don't bother recursing; instead implement a hard-coded sort using just ifs and swaps (have a look at the fast_small_sort function in this code; a toy version is sketched after this list). This can lead to a win of about 50% in CPU cycles, giving the quicksort the same practical performance as a well-written merge sort.
  3. Spend time picking a better pivot value when a "poor" pivot selection is detected. My implementation switches to a "median of medians" algorithm for pivot selection whenever a pivot leads to one side holding under 16% of the remaining elements to be sorted. This is a mitigation strategy for the worst case of quicksort, and helps ensure that in practice the running time stays O(n*log(n)) instead of degrading to O(n^2).
  4. Optimize for arrays with lots of equal values (when needed). If the arrays to be sorted have lots of equal values, this is worth optimizing for, since equal values otherwise lead to poor pivot selection. In my code I do it by counting all the array entries that are equal to the pivot value. This lets me treat the pivot and all equal values in the array in a faster way, and it doesn't degrade performance when it is not applicable. It is another mitigation strategy for worst-case performance; it drastically reduces the worst-case stack usage by lowering the maximum recursion depth.
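
For illustration, here is a tiny hard-coded sort for up to three elements using only ifs and swaps, in the spirit of point 2 (the real fast_small_sort in the linked code covers more sizes):

//hard-coded sort for n <= 3, using only ifs and swaps
static void small_sort3(int *v, int n)
{
    int t;
    if(n >= 2 && v[0] > v[1]) { t = v[0]; v[0] = v[1]; v[1] = t; }
    if(n == 3)
    {
        if(v[1] > v[2]) { t = v[1]; v[1] = v[2]; v[2] = t; }
        if(v[0] > v[1]) { t = v[0]; v[0] = v[1]; v[1] = t; }
    }
}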

I hope this helps, Laurent.

answered Oct 10 '22 by Laurent