Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Insertion sort much faster than shell sort

I'm reading chapter about sorting in Sedgewick's "Algorithms". Along the way I wrote 3 basic algorithms for sorting: selection, insertion and shell sort. The book says, that, despite the fact that all three have quadratic worst case complexity, shell sort should be much faster than insertion sort on random data. In the book they get 600x performance boost.

But I get following multipliers (almost do not change with increasing of array size) on my laptop:

  • selection: 5.5x
  • insertion: 1x
  • shell: 1.8x !

The question that bothers me is - why shell sort is almost two times slower, than insertion sort?!

I guess, something is wrong with my shellsort implementation. But I almost copied it from the book:

class ShellSort extends Sort {
    //precalculate sequence: 1, 4, 13, 40, 121, 364, 1093, ...
    //(3^20 - 1)/2 is enough for any array I sort
    private static final int[] SEQUENCE = new int[20];
    static {
        for(int i = 1; i <= SEQUENCE.length; i++)
            SEQUENCE[i - 1] = (int)(Math.pow(3, i) - 1) / 2;
    }

    public void sort(int[] a) {
        int length = a.length;

        int seqLen = SEQUENCE.length;
        int nth;
        int j;

        for(int seqi = seqLen - 1; seqi >= 0; seqi--) {
            if(SEQUENCE[seqi] < length / 3) {
                nth = SEQUENCE[seqi];
                for(int n = 0; n < length; n+=nth) {
                    j = n;
                    while(j > 0 && a[j] < a[j - nth]) {
                        exch(a, j, j-nth);
                        j -= nth;
                    }
                }
            }
        }
    }
}

Rest of the code for those, who would like to run test on their machine (doubling array size test with JVM heat up has no significant effect on the results, so this simple test is good enough for N > ~ 200 000).

main:

int N = 500_000;
Random rand = new Random();
int[] a = new int[N];
for(int i = 0; i < N; i++)
    a[i] = rand.nextInt();

//insertion sort
int[] aCopy = Arrays.copyOf(a, a.length);
long start = System.nanoTime();
new InsertionSort().sort(aCopy);
System.out.println("insert:\t" + (System.nanoTime() - start));

//shell sort
aCopy = Arrays.copyOf(a, a.length);
start = System.nanoTime();
new ShellSort().sort(aCopy);
System.out.println("shell:\t" + (System.nanoTime() - start));

InsertionSort and Sort classes:

class InsertionSort extends Sort {
    public void sort(int[] a) {
        int length = a.length;
        int j;
        int x;
        for(int i = 1; i < length; i++) {
            j = i;
            x = a[i];
            while(j > 0 && x < a[j-1]) {
                a[j] = a[--j];
            }
            a[j] = x;
        }
    }
}
abstract class Sort {
    abstract public void sort(int[] a);

    protected static final void exch(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
like image 823
Oroboros102 Avatar asked Feb 27 '14 07:02

Oroboros102


People also ask

Which is better shell sort or insertion sort?

In insertion sort, we move elements only one position ahead. When an element has to be moved far ahead, many movements are involved. The idea of ShellSort is to allow the exchange of far items. In Shell sort, we make the array h-sorted for a large value of h.

Is insertion sort the fastest?

Insertion sort is a simple sorting algorithm which is most effective on smaller lists (i.e less data). It is quite slow at larger lists, but very fast with small lists.

Which sort method is fastest?

If you've observed, the time complexity of Quicksort is O(n logn) in the best and average case scenarios and O(n^2) in the worst case. But since it has the upper hand in the average cases for most inputs, Quicksort is generally considered the “fastest” sorting algorithm.

Is insertion sort faster than heap sort?

But in all case Insertion sort is very much faster compared to bubble and heap sort. Theoretically heap sort is supposed to be the best in case of worst scenario.


2 Answers

Your implementation is broken and outputs the sorted array only due to the fact that the last step is 1 and your two internal cycles perform the basic insertion sort when the step is 1. When the step is greater then 1, the two internal cycles in your implementation do anything but step-sort the array, so what you implementation does is it shuffles the array in all iterations of the outer cycle and then insertion-sorts it in the last iteration of the outer cycle. Of course it will take longer then just insertion-sort it once.

Reusing your sequence the proper shell sort implementation should look like this:

public void sort( int[] a ) {
    int length = a.length;

    int stepIndex = 0;
    while ( stepIndex < SEQUENCE.length - 1 && SEQUENCE[ stepIndex ] < length / 3 ) {
        stepIndex++;
    }

    while ( stepIndex >= 0 ) {
        int step = SEQUENCE[ stepIndex-- ];
        for ( int i = step; i < length; i++ ) { // DIFF: i++ instead of i+=step
            for ( int j = i; j >= step && a[ j ] < a[ j - step ]; j -= step ) {
                exch( a, j, j - step );
            }
        }
    }
}

Two main differences between this implementation and yours:

  • proper initial indexes for two internal cycles
  • proper index increment for middle cycle (+1 instead of +step in your code)

Also, check the http://algs4.cs.princeton.edu/21elementary/Shell.java.html for a good implementation and good step sequence.

like image 138
Oleg Estekhin Avatar answered Oct 03 '22 18:10

Oleg Estekhin


From a quick glance you can see that shell sort looks slower by having more loops. Brute force, you can put a system.out.println in the innermost loop to see how many comparisons are made.

3 Loops of shellsort

  • for(int seqi = seqLen - 1; seqi >= 0; seqi--)
  • for(int n = 0; n < length; n+=nth)
  • while(j > 0 && a[j] < a[j - nth])

2 Loops of insertion

  • for(int i = 1; i < length; i++)
  • while(j > 0 && x < a[j-1])
like image 32
jcalloway Avatar answered Oct 03 '22 18:10

jcalloway