What is the fastest sort algorithm for 0-65535 integers?

Question

I have to sort a number of integers, which can have values between 30.000.000 and 350.000.000. There will be between 0 and 65.535 integers, with the average count being 20.000. RAM usage is irrelevant and speed only is important.

Later on i will also have to split them into groups, with the divide always being set whenever the gap between two of these values is >65.535, which is what i need the algorithm for.

If it makes any difference, the algorithm will be used in a Perl script.

Edit: After thinking it over and reading the answers i've come to realize something: I don't actually care about the data itself. As i really only want to find the start and end values of groups with small gaps, the sorting only needs to create buckets and can discard the data.

Edit2: After some testing and trying out the answers provided, the fastest way i found was this:

my @sort = sort {$a <=> $b} @item_offsets;
my @buckets;
my $start = shift @sort;
push @buckets, [$start,$start];
for my $item ( @sort ) {
    if ( $item < $buckets[$#buckets][1]+$gap ) {
        $buckets[$#buckets][1] = $item;
    }
    else {
        push @buckets, [$item,$item];
    }
}
say $#buckets;

Brian · Accepted Answer

I'd just make an array of buckets before running the algorithm, one for each group of 65536 consecutive values. The buckets will contain a min and max value of their contents, but won't store the contents themselves. After running the algorithm, do a single pass over the buckets. If there are two consecutive non-empty buckets with min(bucket2)-max(bucket1) < 65536, combine them. Combining will not happen until the algorithm finishes running. Discard any empty buckets. This algorithm is linear time.

Take note of Bucket Sort.

Michael Carman · Answer

It's unlikely that you'll be able to write a sort algorithm in Perl that will perform better than Perl's builtin sort function:

@numbers = sort {$a <=> $b} @numbers;

You can experiment with the sort pragma to see if a particular algorithm is better:

use sort '_quicksort';
use sort '_mergesort';

Since your cutpoints will vary depending on the data distribution, I think you need to sort the entire list first then loop over it to do the cutting.

my $prev  = shift @numbers;  # already sorted
my @group = [$prev];
my $i     = 0;

foreach my $n (@numbers) {
    $i++ if ($n - $prev > 65535);
    push @{$group[$i]}, $n;
    $prev = $n;
}

FlySwat · Answer

I'd use a radix sort, since you need to group the output.

Chris Marisic · Answer

I was just going to say radix sort, http://en.wikipedia.org/wiki/Radix_sort however that could be a bit above what you were looking to implement, Introsort is generally the accepted sorting solution for data http://en.wikipedia.org/wiki/Introsort, it's a variation of quicksort that switches to heapsort when it reaches smaller sets as it's faster on smaller sets than quicksort.

What is the fastest sort algorithm for 0-65535 integers?

Tags:

performance

algorithm

sorting

perl

Mithaldu

4 Answers

Brian

Michael Carman

FlySwat

Chris Marisic

Recent Activity

Donate For Us

What is the fastest sort algorithm for 0-65535 integers?

Tags:

performance

algorithm

sorting

perl

Mithaldu

4 Answers

Brian

Michael Carman

FlySwat

Chris Marisic

Related questions

Recent Activity

Donate For Us