Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In what situations do I use these sorting algorithms?

I know the implementation for most of these algorithms, but I don't know for what sized data sets to use them for (and the data included):

  1. Merge Sort
  2. Bubble Sort (I know, not very often)
  3. Quick Sort
  4. Insertion Sort
  5. Selection Sort
  6. Radix Sort
like image 340
Jake Byman Avatar asked Mar 14 '13 08:03

Jake Byman


People also ask

What can sorting algorithms be used for?

A Sorting Algorithm is used to rearrange a given array or list of elements according to a comparison operator on the elements. The comparison operator is used to decide the new order of elements in the respective data structure.

Where are sorting algorithms used in real life?

Some of the best examples of real-world implementation of the same are: Bubble sorting is used in programming TV to sort channels based on audience viewing time! Databases use external merge sort to sort sets of data that are too large to be loaded entirely into memory!

What is the importance of sorting algorithms in real life situation?

A sorting algorithm will put items in a list into an order, such as alphabetical or numerical order. For example, a list of customer names could be sorted into alphabetical order by surname, or a list of people could be put into numerical order by age.


2 Answers

First of all, you take all the sorting algorithms that have a O(n2) complexity and throw them away.

Then, you have to study several proprieties of your sorting algorithms and decide whether each one of them will be better suited for the problem you want to solve. The most important are:

Is the algorithm in-place? This means that the sorting algorithm doesn't use any (O(1) actually) extra memory. This propriety is very important when you are running memory-critical applications.

Bubble-sort, Insertion-sort and Selection-sort use constant memory. There is an in-place variant for Merge-sort too.

Is the algorithm stable? This means that if two elements x and y are equal given your comparison method, and in the input x is found before y, then in the output x will be found before y.

Merge-sort, Bubble-sort and Insertion-sort are stable.

Can the algorithm be parallelized? If the application you are building can make use of parallel computation, you might want to choose parallelizable sorting algorithms.

More info here.

like image 71
alestanis Avatar answered Sep 28 '22 16:09

alestanis


Use Bubble Sort only when the data to be sorted is stored on rotating drum memory. It's optimal for that purpose, but not for random-access memory. These days, that amounts to "don't use Bubble Sort".

Use Insertion Sort or Selection Sort up to some size that you determine by testing it against the other sorts you have available. This usually works out to be around 20-30 items, but YMMV. In particular, when implementing divide-and-conquer sorts like Merge Sort and Quick Sort, you should "break out" to an Insertion sort or a Selection sort when your current block of data is small enough.

Also use Insertion Sort on nearly-sorted data, for example if you somehow know that your data used to be sorted, and hasn't changed very much since.

Use Merge Sort when you need a stable sort (it's also good when sorting linked lists), beware that for arrays it uses significant additional memory.

Generally you don't use "plain" Quick Sort at all, because even with intelligent choice of pivots it still has Omega(n^2) worst case but unlike Insertion Sort it doesn't have any useful best cases. The "killer" cases can be constructed systematically, so if you're sorting "untrusted" data then some user could deliberately kill your performance, and anyway there might be some domain-specific reason why your data approximates to killer cases. If you choose random pivots then the probability of killer cases is negligible, so that's an option, but the usual approach is "IntroSort" - a QuickSort that detects bad cases and switches to HeapSort.

Radix Sort is a bit of an oddball. It's difficult to find common problems for which it is best, but it has good asymptotic limit for fixed-width data (O(n), where comparison sorts are Omega(n log n)). If your data is fixed-width, and the input is larger than the number of possible values (for example, more than 4 billion 32-bit integers) then there starts to be a chance that some variety of radix sort will perform well.

like image 39
Steve Jessop Avatar answered Sep 28 '22 18:09

Steve Jessop