Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When is each sorting algorithm used? [closed]

People also ask

When should we use which sorting algorithm?

Quicksort is probably more effective for datasets that fit in memory. For larger data sets it proves to be inefficient so algorithms like merge sort are preferred in that case. Quick Sort in is an in-place sort (i.e. it doesn't require any extra storage) so it is appropriate to use it for arrays.

Which sorting algorithm is best when list is already sorted?

Insertion sort runs much more efficiently if the array is already sorted or "close to sorted."

What are four commonly used sorting algorithms?

When it comes to Computer Science, there are four main algorithms that you need to have in your arsenal. Bubble sort, selections sort, merge sort, and quickSort.


First, a definition, since it's pretty important: A stable sort is one that's guaranteed not to reorder elements with identical keys.

Recommendations:

Quick sort: When you don't need a stable sort and average case performance matters more than worst case performance. A quick sort is O(N log N) on average, O(N^2) in the worst case. A good implementation uses O(log N) auxiliary storage in the form of stack space for recursion.

Merge sort: When you need a stable, O(N log N) sort, this is about your only option. The only downsides to it are that it uses O(N) auxiliary space and has a slightly larger constant than a quick sort. There are some in-place merge sorts, but AFAIK they are all either not stable or worse than O(N log N). Even the O(N log N) in place sorts have so much larger a constant than the plain old merge sort that they're more theoretical curiosities than useful algorithms.

Heap sort: When you don't need a stable sort and you care more about worst case performance than average case performance. It's guaranteed to be O(N log N), and uses O(1) auxiliary space, meaning that you won't unexpectedly run out of heap or stack space on very large inputs.

Introsort: This is a quick sort that switches to a heap sort after a certain recursion depth to get around quick sort's O(N^2) worst case. It's almost always better than a plain old quick sort, since you get the average case of a quick sort, with guaranteed O(N log N) performance. Probably the only reason to use a heap sort instead of this is in severely memory constrained systems where O(log N) stack space is practically significant.

Insertion sort: When N is guaranteed to be small, including as the base case of a quick sort or merge sort. While this is O(N^2), it has a very small constant and is a stable sort.

Bubble sort, selection sort: When you're doing something quick and dirty and for some reason you can't just use the standard library's sorting algorithm. The only advantage these have over insertion sort is being slightly easier to implement.


Non-comparison sorts: Under some fairly limited conditions it's possible to break the O(N log N) barrier and sort in O(N). Here are some cases where that's worth a try:

Counting sort: When you are sorting integers with a limited range.

Radix sort: When log(N) is significantly larger than K, where K is the number of radix digits.

Bucket sort: When you can guarantee that your input is approximately uniformly distributed.


Quicksort is usually the fastest on average, but It has some pretty nasty worst-case behaviors. So if you have to guarantee no bad data gives you O(N^2), you should avoid it.

Merge-sort uses extra memory, but is particularly suitable for external sorting (i.e. huge files that don't fit into memory).

Heap-sort can sort in-place and doesn't have the worst case quadratic behavior, but on average is slower than quicksort in most cases.

Where only integers in a restricted range are involved, you can use some kind of radix sort to make it very fast.

In 99% of the cases, you'll be fine with the library sorts, which are usually based on quicksort.


The Wikipedia page on sorting algorithms has a great comparison chart.

http://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms


What the provided links to comparisons/animations do not consider is when the amount of data exceed available memory --- at which point the number of passes over the data, i.e. I/O-costs, dominate the runtime. If you need to do that, read up on "external sorting" which usually cover variants of merge- and heap sorts.

http://corte.si/posts/code/visualisingsorting/index.html and http://corte.si/posts/code/timsort/index.html also have some cool images comparing various sorting algorithms.