Quicksort slower than Mergesort?

I was working on implementing a quicksort yesterday, and then I ran it, expecting a faster runtime than the Mergesort (which I had also implemented). I ran the two, and while the quicksort was faster for smaller data sets <100 elements (and I did verify that it works), the mergesort became the quicker algorithm fairly quickly. I had been taught that quicksort is almost always "quicker" than mergesort, and I understand that there is some debate on this topic, but I at least expected it to be closer than this. For data sets >10000 elements, the mergesort was over 4 times faster. Is this to be expected, or is there an error in my quicksort code?

mergesort:

public static void mergeSort(int[] e) {
    if (e.length <= 1) return;
    int[] first = new int[e.length / 2];
    int[] second = new int[e.length - first.length];
    System.arraycopy(e, 0, first, 0, first.length);
    System.arraycopy(e, first.length, second, 0, second.length);
    mergeSort(first);
    mergeSort(second);
    System.arraycopy(merge(first, second), 0, e, 0, e.length);
}

private static int[] merge(int[] first, int[] second) {
    int iFirst = 0;
    int iSecond = 0;
    int iCombined = 0;

    int[] combined = new int[first.length + second.length];
    while (iFirst < first.length && iSecond < second.length) {
        if (first[iFirst] > second[iSecond]) {
            combined[iCombined++] = second[iSecond++];
        }
        else combined[iCombined++] = first[iFirst++];
    }
    for (; iFirst < first.length; iFirst++) {
        combined[iCombined++] = first[iFirst];
    }
    for (; iSecond < second.length; iSecond++) {
        combined[iCombined++] = second[iSecond];
    }
    return combined;
}

quicksort:

public static void quicksort(int[] a, int first, int last) {
    if (first >= last) return;

    int partitionIndex = partition(a, first, last);
    quicksort(a, first, partitionIndex - 1);
    quicksort(a, partitionIndex + 1, last);
}

public static int partition(int[] x, int first, int last) {
    int left = first;
    int right = last;
    int pivot = x[first];
    int pivotIdx = first;

    while (left <= right) {
        while (left < x.length && x[left] <= pivot) left++;
        while (right >= 0 && x[right] > pivot) right--;
        if (left <= right) {
            int temp = x[left];
            x[left] = x[right];
            x[right] = temp;
        }
    }
    pivotIdx = right;
    x[first] = x[right];
    x[pivotIdx] = pivot;
    return pivotIdx;
}
asked Jan 31 '09 by John Murano



2 Answers

I actually just wrote a "linked-list comparative sort demo program" in C and arrived at a similar conclusion (that mergesort will beat quicksort for most uses), although I have been told that quicksort is generally not used for linked lists anyway. I would note that the choice of pivot values is a monster factor -- my initial version used a random node as the pivot, and when I refined it a bit to take a mean of two (random) nodes, the execution time for 1000000 records went from over 4 minutes to less than 10 seconds, putting it right on par with mergesort.

Mergesort and quicksort have the same best-case big O (n*log(n)), and despite what people may try to claim, big O is really about iteration count, not comparison count. The biggest difference that can be produced between the two of them will always be to quicksort's detriment, and it involves lists that are already largely sorted or that contain a large number of ties (when quicksort does better than mergesort, the difference is nowhere near as great).

This is because ties and already-sorted segments streamline straight through mergesort: when two split lists come back to be merged, if one list already contains all the smaller values, everything on the left is compared one at a time to the first element of the right, and then (since the returned lists have an internal order) no further comparisons are needed and the right is simply iterated onto the end. That is to say, the number of iterations stays constant, but the number of comparisons is cut in half. If you are talking about actual time and are sorting strings, it's the comparisons that are expensive.
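You can see the comparison-halving effect directly by counting comparisons in a merge. The sketch below is my own (the counter and class name are not from the question's code, though the merge is modeled on it): when the left half is entirely smaller than the right, each left element is compared only against the first right element, and the rest of the right is copied with no comparisons at all.

```java
// Count comparisons in a merge of two sorted arrays (sketch, not the asker's code).
class MergeComparisonDemo {
    static int comparisons;

    static int[] merge(int[] first, int[] second) {
        int i = 0, j = 0, k = 0;
        int[] out = new int[first.length + second.length];
        while (i < first.length && j < second.length) {
            comparisons++;                       // one comparison per loop pass
            if (first[i] > second[j]) out[k++] = second[j++];
            else                      out[k++] = first[i++];
        }
        while (i < first.length)  out[k++] = first[i++];   // no comparisons here
        while (j < second.length) out[k++] = second[j++];  // or here
        return out;
    }

    public static void main(String[] args) {
        // Left half entirely smaller: only 4 comparisons for 8 elements.
        comparisons = 0;
        merge(new int[]{1, 2, 3, 4}, new int[]{5, 6, 7, 8});
        System.out.println("already-ordered halves: " + comparisons);  // 4

        // Interleaved halves: 7 comparisons, the worst case of n-1.
        comparisons = 0;
        merge(new int[]{1, 3, 5, 7}, new int[]{2, 4, 6, 8});
        System.out.println("interleaved halves:     " + comparisons);  // 7
    }
}
```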

Ties and already-sorted segments in quicksort can easily lead to unbalanced lists if the pivot value is not carefully determined, and the imbalanced lists (e.g., one element on the right, ten on the left) are what cause the slowdown. So, if you can get your quicksort to perform as well on an already sorted list as it does on a randomized list, you've got a good method for finding the pivot.
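One common way to get that property (this is a generic sketch of my own, not the pivot scheme from my demo program) is median-of-three pivot selection: look at the first, middle, and last elements of the range and take their median. On an already sorted range the median of those three is the true median, so the partition splits evenly instead of degenerating.

```java
// Median-of-three pivot chooser: a common hedge against sorted input.
class MedianOfThree {
    // Orders a[first], a[mid], a[last] so their median ends up at mid,
    // then returns mid as the pivot index.
    static int medianIndex(int[] a, int first, int last) {
        int mid = first + (last - first) / 2;
        if (a[first] > a[mid])  swap(a, first, mid);
        if (a[mid]   > a[last]) swap(a, mid, last);
        if (a[first] > a[mid])  swap(a, first, mid);
        return mid;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7};
        int m = medianIndex(sorted, 0, sorted.length - 1);
        // On sorted input the chosen pivot is the true median.
        System.out.println(sorted[m]);  // 4
    }
}
```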

If you're interested, the demo program produces output like this:

[root~/C] ./a.out -1 3
Using "", 0 records
Primary Criteria offset=128

Command (h for help, Q to quit): N
How many records? 4000000
New list is 562500.00 kb

Command (h for help, Q to quit): m

Mergesorting..............3999999 function calls
123539969 Iterations     Comparison calls: 82696100
Elapsed time: 0 min 9 sec

Command (h for help, Q to quit): S
Shuffled.

Command (h for help, Q to quit): q

Quicksorting..............4000000 function calls
190179315 Iterations     Comparison calls: 100817020
Elapsed time: 0 min 23 sec

Although without the krazy kolors. There's some more stuff about it by me about halfway down this page.

PS: neither sort requires extra memory with the linked list.

answered Nov 24 '22 by mk27


Mergesort is a lot slower for random array-based data, as long as it fits in RAM. This is the first time I've seen it debated.

  • qsort the shortest subarray first.
  • switch to insertion sort below 5-25 elements
  • do a normal pivot selection

Your qsort is very slow because it tries to partition and qsort arrays of length 2 and 3.
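A rough sketch of the cutoff idea (the cutoff value of 10 and the Lomuto-style partition with a middle pivot are my own choices for illustration, not a prescription): instead of recursing all the way down to ranges of length 2 and 3, hand small ranges to insertion sort.

```java
// Quicksort that falls back to insertion sort on small ranges (sketch).
class QuicksortCutoff {
    static final int CUTOFF = 10;  // the answer suggests anywhere from 5 to 25

    static void quicksort(int[] a, int first, int last) {
        if (last - first < CUTOFF) {          // tiny range: don't partition it
            insertionSort(a, first, last);
            return;
        }
        int p = partition(a, first, last);
        quicksort(a, first, p - 1);
        quicksort(a, p + 1, last);
    }

    static void insertionSort(int[] a, int first, int last) {
        for (int i = first + 1; i <= last; i++) {
            int key = a[i], j = i - 1;
            while (j >= first && a[j] > key) a[j + 1] = a[j--];
            a[j + 1] = key;
        }
    }

    // Lomuto partition with a middle pivot; any correct scheme works here.
    static int partition(int[] a, int first, int last) {
        int mid = first + (last - first) / 2;
        int t = a[mid]; a[mid] = a[last]; a[last] = t;  // move pivot to the end
        int pivot = a[last], store = first;
        for (int i = first; i < last; i++) {
            if (a[i] < pivot) { t = a[i]; a[i] = a[store]; a[store++] = t; }
        }
        t = a[store]; a[store] = a[last]; a[last] = t;  // pivot into final place
        return store;
    }

    public static void main(String[] args) {
        int[] a = new java.util.Random(42).ints(1000, 0, 1000).toArray();
        quicksort(a, 0, a.length - 1);
        for (int i = 1; i < a.length; i++)
            if (a[i - 1] > a[i]) throw new AssertionError("not sorted");
        System.out.println("sorted");
    }
}
```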

answered Nov 23 '22 by Stephan Eggermont