Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When to use non-comparison sorting over comparison sorting

In class we learned about a bunch of new non-comparison sorts for the sake of avoiding the lower bound of omega(nlogn) for all comparison based sorts. But what was a bit unclear to me was the pro's and con's to when to use which family of sorting algorithms.

Can't any data set be tweaked so that non-comparison sorting algorithms (radix, bucket, key-indexed) can be used? If so, what's the point of comparison sorts even existing?

Sorry for this being such a rudimentary question, but I really can't find anything online.

like image 977
Lucas Ou-Yang Avatar asked Jan 20 '13 04:01

Lucas Ou-Yang


2 Answers

Not every set of items can be tweaked to be used in non-comparison sorts in an efficient way. For example, sorting arbitrary precision numbers would require running the loop inside the bucket sort many times, killing the performance.

The problem with radix sorts of the world is that they must examine every element of every item being sorted. Comparison-based sorts, on the other hand, can skip a fair number of sub-elements (digits, characters, etc.) For example, when a comparison function checks two strings, it stops at the first difference, skipping the tails of both strings. Bucket sort, on the other hand, must examine all characters in every string*.

In general, chasing the best asymptotic complexity is not always a good strategy: the value of N where using a significantly more complex algorithm pays off is often too high to make the more complex algorithms practical. For example, quicksort has very bad worse time complexity, yet on average it beats most other algorithms hands down due to its very low overhead, making it a good choice in most practical situations.


* In practice implementations of bucket sort avoid the need to look at all sub-elements (digits, characters, etc.) by switching to a comparison-based sort as soon as the number of items in a bucket drops below a certain threshold. This hybrid approach beats both a plain comparison-based sort and a plain bucket sort.
like image 130
Sergey Kalinichenko Avatar answered Sep 21 '22 01:09

Sergey Kalinichenko


The problem with non-comparison sorting is that their complexity is usually dependent on other parameters than the size of an input. Radix sort, for instance, has O(kn) complexity, where k is the highest number of digits in an element - the question is, how does k relate to n. If k is about the same as n, the algorithm becomes O(n^2).

like image 44
Maciej Stachowski Avatar answered Sep 23 '22 01:09

Maciej Stachowski