Is it possible for the ordering of the input to affect the performance of Array.sort()
? If so, how?
Explanation: In Insertion sort if the array is already sorted then it takes O(n) and if it is reverse sorted then it takes O(n2) to sort the array. In Quick sort, if the array is already sorted or if it is reverse sorted then it takes O(n2). The best and worst case performance of Selection is O(n2) only.
Quicksort. Quicksort is one of the most efficient sorting algorithms, and this makes of it one of the most used as well. The first thing to do is to select a pivot number, this number will separate the data, on its left are the numbers smaller than it and the greater numbers on the right.
The array can be sorted in ascending order by repeatedly finding the minimum element (considering ascending order) from unsorted part and putting it at the beginning. The algorithm maintains two subarrays in a given array. The subarray which is already sorted. Remaining subarray which is unsorted.
The sort() method returns a reference to the original array, so mutating the returned array will mutate the original array as well.
This depends on a few things:
An application I am working on experienced severe performance degradation in a module which was sorting a list of 35K+ strings after the API endpoint it was hitting started offering it data in sorted order. The time spent in the front-end sort went from around 30 ms to 6 seconds (200x).
The sorting is done using a custom comparator which prioritises strings ending with a certain suffix. If none or both of the strings end in the suffix, then natural ordering is used. I profiled the module using the browser developer tools and found out that most of the time was spent doing this comparison. The profile also showed that QuickSort
is the underlying algorithm used by Array.sort()
(at least in Chrome).
This was strange, as QuickSort should not be affected by the ordering of the input. According to Wikipedia:
[the worst case] may occur if the pivot happens to be the smallest or largest element in the list, or in some implementations (e.g., the Lomuto partition scheme as described above) when all the elements are equal.
I became curious and benchmarked multiple variants of doing the sort. I used benchmark.js running on node in the command line. Both the benchmark and the browser are running on top of v8 so they should use the same sorting algorithm. The results were surprising:
6 tests completed.
Ordered array, sorted with a default comparator x 34.27 ops/sec ±1.07% (59 runs sampled)
Ordered array, sorted with a custom comparator x 0.18 ops/sec ±2.81% (5 runs sampled)
Ordered array, shuffled, sorted with a custom comparator x 38.37 ops/sec ±3.67% (51 runs sampled)
Ordered array, shuffled, sorted with a default comparator x 29.20 ops/sec ±1.28% (51 runs sampled)
Unordered array, sorted with a default comparator x 28.38 ops/sec ±1.28% (50 runs sampled)
Unordered array, sorted with a custom comparator x 42.10 ops/sec ±1.32% (55 runs sampled)
These results indicate that the performance degradation is because of the distribution of the data in relation to the comparator. Here are some characteristics of the input:
/Prod
) matches about 20% of the strings/Prod
are very likely to be spread relatively uniformly throughoutABC/Alpha
, ABC/Beta
, ABC/Prod
are common. This probably makes the algorithm more likely to select a pivot which is at an extreme in its subsequence and thus causes a very large number of comparisons between elements to be performed.
This only happens in Chrome 61. I've tested both Firefox 52.3 and Safari 10.1 and the issue is not reproduced. I presume it's because a different sorting algorithm is used.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With