Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

QuickSort vs MergeSort on arrays of primitives in Java

I know that Java's Arrays.sort method uses MergeSort for sorting arrays of objects (or collections of objects) since it is stable, and Java uses QuickSort for arrays of primitives because we don't need stability since two equal ints are indistinguishable, i.e. their identity doesn't matter.

My question is, in the case of primitives, why doesn't Java use MergeSort's guaranteed O(n log n) time and instead goes for the average O(n log n) time of QuickSort? In the last paragraph of one of the related answers here, it is explained that:

For reference types, where the referred objects usually take up far more memory than the array of references, this generally does not matter. But for primitive types, cloning the array outright doubles the memory usage.

What does this mean? Cloning a reference is still at-least as costly as cloning a primitive. Are there any other reasons for using QuickSort (average O(n log n)) instead of MergeSort (guaranteed O(n log n) time) on arrays of primitives?

like image 208
qrius Avatar asked Mar 13 '17 01:03

qrius


People also ask

Which is faster Merge sort or quicksort?

Merge sort is more efficient and works faster than quick sort in case of larger array size or datasets. Quick sort is more efficient and works faster than merge sort in case of smaller array size or datasets.

Why is quicksort better than Merge sort?

Quick sort is an in-place sorting algorithm. In-place sorting means no additional storage space is needed to perform sorting. Merge sort requires a temporary array to merge the sorted arrays and hence it is not in-place giving Quick sort the advantage of space.

What is the difference between Merge sort and quicksort?

Merge sort is an external sorting method in which the data that is to be sorted can be stored outside the memory and is loaded in small chunks into the memory for sorting. Quicksort is an internal sorting method, where the data that is to be sorted needs to be stored in the main memory throughout the sorting process.

Why is quick sort better for arrays?

Quick Sort is also a cache friendly sorting algorithm as it has good locality of reference when used for arrays. Quick Sort is also tail recursive, therefore tail call optimizations is done.


2 Answers

Not all O(n log n) algorithms have the same constant factors. Quicksort, in the 99.9% of cases where it takes n log n time, runs in a much faster n log n than mergesort. I don't know the exact multiplier -- and it'll vary system to system -- but, say, quicksort could run twice as fast as merge sort on average and still have theoretical worst case n^2 performance.

Additionally, Quicksort doesn't require cloning the array in the first place, and merge sort inevitably does. But you don't have a choice for reference types if you want a stable sort, so you have to accept the copy, but you don't need to accept that cost for primitives.

like image 107
Louis Wasserman Avatar answered Oct 06 '22 22:10

Louis Wasserman


Cloning a reference is still at-least as costly as cloning a primitive.

Most (or all?) implementations of Java implement an array of objects as an array of pointers (references) to objects. So cloning an array of pointers (references) would consume less space than cloning the objects themselves if the objects are larger in size than a pointer (reference).

I don't know why the term "cloning" was used. Merge sort allocates a second temp array, but the array is not a "clone" of the original. Instead an proper merge sort alternates the direction of merge from original to temp or from temp to original depending on iteration for bottom up, or depending on level of recursion for top down.

dual pivot quick sort

Based on what I can find doing web searches, Java's dual pivot quicksort keeps track of "recursions", and switches to heap sort if the recursion depth is excessive, to maintain O(n log(n)) time complexity, but at a higher cost factor.

quick sort versus merge sort

In addition to stability, merge sort can be faster for sorting an array of pointers (references) to objects. Merge sort does more moves (of the pointers) but fewer compares (of the objects accessed by dereferencing pointers), than quick sort.

On a system with 16 registers (most of them used as pointers), such as X86 in 64 bit mode, a 4-way merge sort is about as fast as regular quick sort, but I don't recall seeing a 4-way merge sort in a common library, at least not for a PC.

like image 37
rcgldr Avatar answered Oct 06 '22 23:10

rcgldr