Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does Java's Arrays.sort method use two different sorting algorithms for different types?

People also ask

Why do we need different types of sorting algorithms?

Besides educational reasons, We need multiple sorting algorithms because they work best in a couple of situations, and none of them rules them all. For example, although the mean time-complexity of quicksort is impressive, its performance on nearly sorted array is horrible.

What are the two different methods for sorting?

There are two broad types of sorting algorithms: integer sorts and comparison sorts. Comparison sorts compare elements at each step of the algorithm to determine if one element should be to the left or right of another element.

Which sorting algorithm is used in arrays sort?

As mentioned in the official JavaDoc, Arrays. sort uses dual-pivot Quicksort on primitives. It offers O(n log(n)) performance and is typically faster than traditional (one-pivot) Quicksort implementations. However, it uses a stable, adaptive, iterative implementation of mergesort algorithm for Array of Objects.

Which sorting algorithm is used for user defined data types in arrays sort ()?

Arrays. sort(Object[]) is based on the TimSort algorithm, giving us a time complexity of O(n log(n)). In short, TimSort makes use of the Insertion sort and the MergeSort algorithms. However, it is still slower compared to other sorting algorithms like some of the QuickSort implementations.


The most likely reason: quicksort is not stable, i.e. equal entries can change their relative position during the sort; among other things, this means that if you sort an already sorted array, it may not stay unchanged.

Since primitive types have no identity (there is no way to distinguish two ints with the same value), this does not matter for them. But for reference types, it could cause problems for some applications. Therefore, a stable merge sort is used for those.

OTOH, a reason not to use the (guaranteed n*log(n)) stable merge sort for primitive types might be that it requires making a clone of the array. For reference types, where the referred objects usually take up far more memory than the array of references, this generally does not matter. But for primitive types, cloning the array outright doubles the memory usage.


According to Java 7 API docs cited in this answer, Arrays#Sort() for object arrays now uses TimSort, which is a hybrid of MergeSort and InsertionSort. On the other hand, Arrays#sort() for primitive arrays now uses Dual-Pivot QuickSort. These changes were implemented starting in Java SE 7.


One reason I can think of is that quicksort has a worst case time complexity of O(n^2) while mergesort retains worst case time of O(n log n). For object arrays there is a fair expectation that there will be multiple duplicate object references which is one case where quicksort does worst.

There is a decent visual comparison of various algorithms, pay particular attention to the right-most graph for different algorithms.


I was taking Coursera class on Algorithms and in one of the lectures Professor Bob Sedgewick mentioning the assessment for Java system sort:

"If a programmer is using objects, maybe space is not a critically important consideration and the extra space used by a merge sort maybe not a problem. And if a programmer is using primitive types, maybe the performance is the most important thing so they use quick sort."


java.util.Arrays uses quicksort for primitive types such as int and mergesort for objects that implement Comparable or use a Comparator. The idea of using two different methods is that if a programmer’s using objects maybe space is not a critically important consideration and so the extra space used by mergesort maybe’s not a problem and if the programmer’s using primitive types maybe performance is the most important thing so use the quicksort.

For Example: This is the example when sorting stability matters.

enter image description here

That’s why stable sorts make sense for object types, especially mutable object types and object types with more data than just the sort key, and mergesort is such a sort. But for primitive types stability is not only irrelevant. It’s meaningless.

Source: INFO