 

Fastest sorting algorithm for a specific situation

What is the fastest sorting algorithm for a large number (tens of thousands) of groups of 9 positive double-precision values, where each group must be sorted individually? So it has to sort a small number of possibly repeated double-precision values quickly, many times in a row. The values are in the [0..1] interval. I don't care about space complexity or stability, just about speed.

asked Jun 08 '10 by luvieere

3 Answers

For sorting each group individually, merge sort would probably be the easiest to implement with good results.

A sorting network would probably be the fastest solution: http://en.wikipedia.org/wiki/Sorting_network
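
Since the groups are always exactly 9 elements, the network's comparator sequence can be hard-coded. Below is a sketch in Java (my choice of language, since the question doesn't name one), using a published 25-comparator network for 9 inputs; treat the exact comparator list as something to verify against the Wikipedia article rather than as gospel:

    // Sketch: fixed sorting network for exactly 9 doubles.
    // The comparator sequence is a published 25-comparator network for
    // 9 inputs (rows of 3 sorted, then columns, then a short cleanup phase).
    public final class Sort9 {
        private static void cmpSwap(double[] a, int i, int j) {
            if (a[i] > a[j]) { double t = a[i]; a[i] = a[j]; a[j] = t; }
        }

        /** Sorts a[0..8] in place; assumes a.length >= 9. */
        public static void sort9(double[] a) {
            cmpSwap(a, 0, 1); cmpSwap(a, 3, 4); cmpSwap(a, 6, 7);
            cmpSwap(a, 1, 2); cmpSwap(a, 4, 5); cmpSwap(a, 7, 8);
            cmpSwap(a, 0, 1); cmpSwap(a, 3, 4); cmpSwap(a, 6, 7); cmpSwap(a, 2, 5);
            cmpSwap(a, 0, 3); cmpSwap(a, 1, 4); cmpSwap(a, 5, 8);
            cmpSwap(a, 3, 6); cmpSwap(a, 4, 7); cmpSwap(a, 2, 5);
            cmpSwap(a, 0, 3); cmpSwap(a, 1, 4); cmpSwap(a, 5, 7); cmpSwap(a, 2, 6);
            cmpSwap(a, 1, 3); cmpSwap(a, 4, 6);
            cmpSwap(a, 2, 4); cmpSwap(a, 5, 6);
            cmpSwap(a, 2, 3);
        }
    }

The point of the network is that the sequence of comparisons is data-independent: no loop bookkeeping, and the JIT can inline the whole thing.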

answered by Tom Gullen


Good question, because this comes down to "Fastest way to sort an array of 9 elements", and most comparisons and analyses of sorting methods are about large N. I assume the 'groups' are clearly defined and don't play a real role here.

You will probably have to benchmark a few candidates, because a lot of factors (such as cache locality) come into play here.
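
A quick-and-dirty timing sketch of such a benchmark (in Java, my substitution since the question doesn't name a language; Arrays.sort stands in for whichever candidate is under test, and a serious comparison should use a harness like JMH to deal with JIT warmup):

    import java.util.Arrays;
    import java.util.Random;

    // Naive timing sketch; a real benchmark should use a harness such as JMH.
    public class Bench {
        public static void main(String[] args) {
            Random rnd = new Random(42);
            double[] data = new double[9 * 50_000];   // tens of thousands of 9-groups
            for (int i = 0; i < data.length; i++) data[i] = rnd.nextDouble();

            for (int run = 0; run < 5; run++) {       // later runs are JIT-warmed
                double[] work = data.clone();         // fresh unsorted copy each run
                long t0 = System.nanoTime();
                for (int off = 0; off < work.length; off += 9) {
                    Arrays.sort(work, off, off + 9);  // swap in each candidate here
                }
                System.out.printf("run %d: %.1f ms%n", run, (System.nanoTime() - t0) / 1e6);
            }
        }
    }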

In any case, making it parallel sounds like a good idea. Use Parallel.For() if you can use .NET 4.
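
Parallel.For() is .NET-specific; in Java, a rough analogue is a parallel stream over the group indices. A minimal sketch, assuming the groups are stored back to back in one flat array:

    import java.util.Arrays;
    import java.util.stream.IntStream;

    // Sketch: groups of 9 stored contiguously in one flat array.
    // Each group is sorted independently, and the ranges are disjoint,
    // so concurrent Arrays.sort calls on them are safe.
    public class ParallelGroupSort {
        public static void sortGroups(double[] data) {
            IntStream.range(0, data.length / 9)
                     .parallel()
                     .forEach(g -> Arrays.sort(data, g * 9, g * 9 + 9));
        }
    }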

answered by Henk Holterman


I think you will need to try out a few examples to see what works best, as you have an unusual set of conditions. My guess is that the best will be one of:

  • sorting network
  • insertion sort (see the sketch after this list)
  • quick sort (one level -- insertion sort below)
  • merge sort
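
For reference, a minimal insertion sort over one group of 9 (a sketch; for N this small the O(N²) bound is irrelevant, and the tight inner loop is what makes it competitive):

    // Sketch: in-place insertion sort for one group of 9 doubles.
    static void insertionSort9(double[] a) {
        for (int i = 1; i < 9; i++) {
            double key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {  // shift larger values right
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }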

Given that double-precision numbers are relatively long, I suspect you will not do better with a radix sort, but feel free to add it in.

For what it's worth, Java uses quicksort on doubles until the number of items to be sorted drops below 7, at which point it uses insertion sort. The third option mimics that solution.
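
That third option might look roughly like this (a sketch loosely mimicking the JDK behavior described above, not the JDK's actual tuned code): partition, and insertion-sort any range once it drops below 7 elements.

    // Sketch of option 3: quicksort with an insertion-sort cutoff at 7.
    // Sorts a[lo..hi) in place; for 9 elements, one partition usually
    // leaves both halves below the cutoff.
    static void hybridSort(double[] a, int lo, int hi) {
        if (hi - lo < 7) {                       // small range: insertion sort
            for (int i = lo + 1; i < hi; i++) {
                double key = a[i];
                int j = i - 1;
                while (j >= lo && a[j] > key) { a[j + 1] = a[j]; j--; }
                a[j + 1] = key;
            }
            return;
        }
        double pivot = a[lo + (hi - lo) / 2];    // middle element as pivot
        int i = lo, j = hi - 1;
        while (i <= j) {                         // Hoare-style partition
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) { double t = a[i]; a[i] = a[j]; a[j] = t; i++; j--; }
        }
        if (lo < j) hybridSort(a, lo, j + 1);
        if (i < hi - 1) hybridSort(a, i, hi);
    }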

Also, your overall problem is embarrassingly parallel, so you want to make use of parallelism when possible. The problem looks too small for a distributed solution (more time would be lost in networking than saved), but if set up right, it can make use of multiple cores very effectively.

answered by Kathy Van Stone