Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Test case for Insertion Sort, MergeSort and Quick Sort

I have implemented (in Java) Insertion Sort, MergeSort, ModifiedMergeSort and Quick Sort:

ModifiedMergeSort has a variable for the "bound" of elements. When the elements to be sorted are less than or equal to "bound" then use Insertion Sort to sort them.

How come Version 1 is better than Versions 3, 4 and 5?

Are the results for Versions 2 and 6 realistic?

Here is my results (In Milliseconds):

Version 1 - Insertion Sort: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       14              19              14.96
N = 20000       59              60              59.3
N = 40000       234             277             243.1

Version 2 - Merge Sort: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       1               15              1.78
N = 20000       3               8               3.4
N = 40000       6               9               6.7

Version 3 - Merge Sort and Insertion Sort on 15 elements: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       41              52              45.42
N = 20000       170             176             170.56
N = 40000       712             823             728.08

Version 4 - Merge Sort and Insertion Sort on 30 elements: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       27              33              28.04
N = 20000       113             119             114.36
N = 40000       436             497             444.2

Version 5 - Merge Sort and Insertion Sort on 45 elements: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       20              21              20.68
N = 20000       79              82              79.7
N = 40000       321             383             325.64

Version 6 - Quick Sort: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       1               9               1.18
N = 20000       2               3               2.06
N = 40000       4               5               4.26

Here is my code:

package com.testing;

import com.sorting.InsertionSort;
import com.sorting.MergeSort;
import com.sorting.ModifiedMergeSort;
import com.sorting.RandomizedQuickSort;

/**
 *
 * @author mih1406
 */
public class Main {
    private static final int R = 50; // # of tests
    private static int N = 0; // Input size
    private static int[] array; // Tests array
    private static int[] temp; // Tests array

    private static long InsertionSort_best = -1;
    private static long InsertionSort_worst = -1;
    private static double InsertionSort_average = -1.0;

    private static long MergeSort_best = -1;
    private static long MergeSort_worst = -1;
    private static double MergeSort_average = -1.0;

    private static long ModifiedMergeSort_V3_best = -1;
    private static long ModifiedMergeSort_V3_worst = -1;
    private static double ModifiedMergeSort_V3_average = -1.0;

    private static long ModifiedMergeSort_V4_best = -1;
    private static long ModifiedMergeSort_V4_worst = -1;
    private static double ModifiedMergeSort_V4_average = -1.0;

    private static long ModifiedMergeSort_V5_best = -1;
    private static long ModifiedMergeSort_V5_worst = -1;
    private static double ModifiedMergeSort_V5_average = -1.0;

    private static long RandomizedQuickSort_best = -1;
    private static long RandomizedQuickSort_worst = -1;
    private static double RandomizedQuickSort_average = -1.0;


    public static void main(String args[]) {
        StringBuilder InsertionSort_text = new StringBuilder(
                "Version 1 - Insertion Sort: Run-Times over 50 test runs\n");

        StringBuilder MergeSort_text = new StringBuilder(
                "Version 2 - Merge Sort: Run-Times over 50 test runs\n");

        StringBuilder ModifiedMergeSort_V3_text = new StringBuilder(
                "Version 3 - Merge Sort and Insertion Sort on 15 elements: Run-Times over 50 test runs\n");

        StringBuilder ModifiedMergeSort_V4_text = new StringBuilder(
                "Version 4 - Merge Sort and Insertion Sort on 30 elements: Run-Times over 50 test runs\n");

        StringBuilder ModifiedMergeSort_V5_text = new StringBuilder(
                "Version 5 - Merge Sort and Insertion Sort on 45 elements: Run-Times over 50 test runs\n");

        StringBuilder RandomizedQuickSort_text = new StringBuilder(
                "Version 6 - Quick Sort: Run-Times over 50 test runs\n");

        InsertionSort_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        MergeSort_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        ModifiedMergeSort_V3_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        ModifiedMergeSort_V4_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        ModifiedMergeSort_V5_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        RandomizedQuickSort_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        // Case N = 10000
        N = 10000;
        fillArray();
        testing_InsertionSort();
        testing_MergeSort();
        testing_ModifiedMergeSort(15);
        testing_ModifiedMergeSort(30);
        testing_ModifiedMergeSort(45);
        testing_RandomizedQuickSort();

        InsertionSort_text.append("N = " + N + "\t\t" + InsertionSort_best
                + "\t\t\t" + InsertionSort_worst + "\t\t\t"
                + InsertionSort_average + "\n");

        MergeSort_text.append("N = " + N + "\t\t" + MergeSort_best
                + "\t\t\t" + MergeSort_worst + "\t\t\t"
                + MergeSort_average + "\n");

        ModifiedMergeSort_V3_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V3_best
                + "\t\t\t" + ModifiedMergeSort_V3_worst + "\t\t\t"
                + ModifiedMergeSort_V3_average + "\n");

        ModifiedMergeSort_V4_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V4_best
                + "\t\t\t" + ModifiedMergeSort_V4_worst + "\t\t\t"
                + ModifiedMergeSort_V4_average + "\n");

        ModifiedMergeSort_V5_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V5_best
                + "\t\t\t" + ModifiedMergeSort_V5_worst + "\t\t\t"
                + ModifiedMergeSort_V5_average + "\n");

        RandomizedQuickSort_text.append("N = " + N + "\t\t" + RandomizedQuickSort_best
                + "\t\t\t" + RandomizedQuickSort_worst + "\t\t\t"
                + RandomizedQuickSort_average + "\n");

        // Case N = 20000
        N = 20000;
        fillArray();
        testing_InsertionSort();
        testing_MergeSort();
        testing_ModifiedMergeSort(15);
        testing_ModifiedMergeSort(30);
        testing_ModifiedMergeSort(45);
        testing_RandomizedQuickSort();

        InsertionSort_text.append("N = " + N + "\t\t" + InsertionSort_best
                + "\t\t\t" + InsertionSort_worst + "\t\t\t"
                + InsertionSort_average + "\n");

        MergeSort_text.append("N = " + N + "\t\t" + MergeSort_best
                + "\t\t\t" + MergeSort_worst + "\t\t\t"
                + MergeSort_average + "\n");

        ModifiedMergeSort_V3_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V3_best
                + "\t\t\t" + ModifiedMergeSort_V3_worst + "\t\t\t"
                + ModifiedMergeSort_V3_average + "\n");

        ModifiedMergeSort_V4_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V4_best
                + "\t\t\t" + ModifiedMergeSort_V4_worst + "\t\t\t"
                + ModifiedMergeSort_V4_average + "\n");

        ModifiedMergeSort_V5_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V5_best
                + "\t\t\t" + ModifiedMergeSort_V5_worst + "\t\t\t"
                + ModifiedMergeSort_V5_average + "\n");

        RandomizedQuickSort_text.append("N = " + N + "\t\t" + RandomizedQuickSort_best
                + "\t\t\t" + RandomizedQuickSort_worst + "\t\t\t"
                + RandomizedQuickSort_average + "\n");

        // Case N = 40000
        N = 40000;
        fillArray();
        testing_InsertionSort();
        testing_MergeSort();
        testing_ModifiedMergeSort(15);
        testing_ModifiedMergeSort(30);
        testing_ModifiedMergeSort(45);
        testing_RandomizedQuickSort();

        InsertionSort_text.append("N = " + N + "\t\t" + InsertionSort_best
                + "\t\t\t" + InsertionSort_worst + "\t\t\t"
                + InsertionSort_average + "\n");

        MergeSort_text.append("N = " + N + "\t\t" + MergeSort_best
                + "\t\t\t" + MergeSort_worst + "\t\t\t"
                + MergeSort_average + "\n");

        ModifiedMergeSort_V3_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V3_best
                + "\t\t\t" + ModifiedMergeSort_V3_worst + "\t\t\t"
                + ModifiedMergeSort_V3_average + "\n");

        ModifiedMergeSort_V4_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V4_best
                + "\t\t\t" + ModifiedMergeSort_V4_worst + "\t\t\t"
                + ModifiedMergeSort_V4_average + "\n");

        ModifiedMergeSort_V5_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V5_best
                + "\t\t\t" + ModifiedMergeSort_V5_worst + "\t\t\t"
                + ModifiedMergeSort_V5_average + "\n");

        RandomizedQuickSort_text.append("N = " + N + "\t\t" + RandomizedQuickSort_best
                + "\t\t\t" + RandomizedQuickSort_worst + "\t\t\t"
                + RandomizedQuickSort_average + "\n");

        System.out.println(InsertionSort_text);
        System.out.println(MergeSort_text);
        System.out.println(ModifiedMergeSort_V3_text);
        System.out.println(ModifiedMergeSort_V4_text);
        System.out.println(ModifiedMergeSort_V5_text);
        System.out.println(RandomizedQuickSort_text);

    }

    private static void fillArray() {
        // (re)creating the array
        array = new int[N];

        // filling the array with random numbers
        // using for-loop and the given random generator instruction
        for(int i = 0; i < array.length; i++) {
            array[i] = (int)(1 + Math.random() * (N-0+1));
        }
    }

    private static void testing_InsertionSort() {
        // run-times arrays
        long [] run_times = new long[R];

        // start/finish times
        long start;
        long finish;
        for(int i = 0; i < R; i++) {
            copyArray();
            // recording start time
            start = System.currentTimeMillis();

            // testing the algorithm
            InsertionSort.sort(temp);

            // recording finish time
            finish = System.currentTimeMillis();

            run_times[i] = finish-start;
        }

        InsertionSort_best = findMin(run_times);
        InsertionSort_worst = findMax(run_times);
        InsertionSort_average = findAverage(run_times);
    }

    private static void testing_MergeSort() {
        // run-times arrays
        long [] run_times = new long[R];

        // start/finish times
        long start;
        long finish;
        for(int i = 0; i < R; i++) {
            copyArray();
            // recording start time
            start = System.currentTimeMillis();

            // testing the algorithm
            MergeSort.sort(temp);

            // recording finish time
            finish = System.currentTimeMillis();

            run_times[i] = finish-start;
        }

        MergeSort_best = findMin(run_times);
        MergeSort_worst = findMax(run_times);
        MergeSort_average = findAverage(run_times);
    }

    private static void testing_ModifiedMergeSort(int bound) {
        // run-times arrays
        long [] run_times = new long[R];

        // setting the bound
        ModifiedMergeSort.bound = bound;

        // start/finish times
        long start;
        long finish;
        for(int i = 0; i < R; i++) {
            copyArray();
            // recording start time
            start = System.currentTimeMillis();

            // testing the algorithm
            ModifiedMergeSort.sort(temp);

            // recording finish time
            finish = System.currentTimeMillis();

            run_times[i] = finish-start;
        }

        if(bound == 15) {
            ModifiedMergeSort_V3_best = findMin(run_times);
            ModifiedMergeSort_V3_worst = findMax(run_times);
            ModifiedMergeSort_V3_average = findAverage(run_times);
        }

        if(bound == 30) {
            ModifiedMergeSort_V4_best = findMin(run_times);
            ModifiedMergeSort_V4_worst = findMax(run_times);
            ModifiedMergeSort_V4_average = findAverage(run_times);
        }

        if(bound == 45) {
            ModifiedMergeSort_V5_best = findMin(run_times);
            ModifiedMergeSort_V5_worst = findMax(run_times);
            ModifiedMergeSort_V5_average = findAverage(run_times);
        }
    }

    private static void testing_RandomizedQuickSort() {
        // run-times arrays
        long [] run_times = new long[R];

        // start/finish times
        long start;
        long finish;
        for(int i = 0; i < R; i++) {
            copyArray();
            // recording start time
            start = System.currentTimeMillis();

            // testing the algorithm
            RandomizedQuickSort.sort(temp);

            // recording finish time
            finish = System.currentTimeMillis();

            run_times[i] = finish-start;
        }

        RandomizedQuickSort_best = findMin(run_times);
        RandomizedQuickSort_worst = findMax(run_times);
        RandomizedQuickSort_average = findAverage(run_times);
    }

    private static long findMax(long[] times) {
        long max = times[0];

        for(int i = 1; i < times.length; i++) {
            if(max < times[i]) {
                max = times[i];
            }
        }

        return max;
    }

    private static long findMin(long[] times) {
        long min = times[0];

        for(int i = 1; i < times.length; i++) {
            if(min > times[i]) {
                min = times[i];
            }
        }

        return min;
    }

    private static double findAverage(long[] times) {
        long sum = 0;
        double avg;

        for(int i = 0; i < times.length; i++) {
            sum += times[i];
        }

        avg = (double)sum/(double)times.length;

        return avg;
    }

    private static void copyArray() {
        temp = new int[N];
        System.arraycopy(array, 0, temp, 0, array.length);
    }
}
like image 439
malhobayyeb Avatar asked Mar 10 '13 00:03

malhobayyeb


1 Answers

There seem to be some systematic errors on the approach you're currently undertaking. I'll state some of the most obvious experimental issues you're facing, even if they might not directly be the culprits of your experimental results.

JVM Compilation

As I've stated previously in a comment, the JVM will by default run your code in interpreted mode. That means each bytecode instruction found in your methods will be interpreted on-the-spot, and then ran.

The advantage of this approach is that it allows your application to have faster startup times than would a Java program that'd be compiled to native code on the startup of each of your runs.

The downside is that while there's no startup performance hit, you'll get a slower performing program than you'd get otherwise.

To ameliorate both concerns, a tradeoff was taken by the JVM team. Initially your program will be interpreted, but the JVM will gather information about which methods (or parts of methods) are being used intensively, and will compile down only those ones that seem to be used a lot. When compiling, you'll get a small performance hit, but then the code will be way faster than was before.

You'll have to take this fact into consideration when doing measurements.

The standard approach is to "warm up the JVM", that is, to run your algorithms for a bit to make sure the JVM does all the compilations it needs to do. It is important to have the code that warms the JVM be the same as the one you'll want to benchmark, otherwise there may be some compilation while you're benchmarking your code.

Measuring time

When measuring time, you should use System.nanoTime() instead of System.currentTimeMillis. I won't go over the details, those can be accessed here.

You should also be careful. The following blocks of code may seem equivalent at first, but will yield different results most of the times:

totalDuration = 0;
for (i = 0; i < 1000; ++i)
    startMeasure = now();
    algorithm();
    endMeasure = now();
    duration = endMeasure - startMeasure;
    totalDuration += duration;
}

//...

TRIALS_COUNT = 1000;
startMeasure = now();
for (i = 0; i < TRIALS_COUNT; ++i)
    algorithm();
}
endMeasure = now();
 duration = endMeasure - startMeasure / TRIALS_COUNT;

Why? Because especially for faster algorithm() implementations, the faster they are, the harder it is to get accurate results.

Large input values

The asymptotic notation is useful to understand how different algorithms will escalate for big values of n. Applying them for small input values is often nonsensical, because at that magnitude generally what you'd want is to count the precise number of operations and their associated costs (something akin to what Jakub did). Big O notation only makes sense for big Os most of the time. Big O will tell you how the algorithm works for excruciating input value sizes, not how it will behave for small numbers. The canonical example is for instance QuickSort, which for big arrays will be king, but that will be generally slower for arrays of size 4 or 5 than Selection or Insertion Sort. Your input sizes seem to be big enough, though.

On a final note

and as previously stated by Chang, the mathematical exercise done by Jakub is completely wrong and should not be taken into consideration.

like image 117
devoured elysium Avatar answered Oct 04 '22 20:10

devoured elysium