Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

c++ sorting algorithms duration

I have been working on calculating duration of these sorting algorithms take. I looped all sorting methods 2000 times then divide total duration into 2000 to get a proper value for duration. The problem is; it does not show the exact value of time that the particular code parts of sorting methods take. I mean the duration variable shows increasing values through the program flows. For example, for N = 10000, insertionSort()gives 0.000635, mergeSort()gives 0.00836 and heapSort() gives 0.018485 and when I change order of these, duration still goes up through program, regardless of algorithm type. I tried giving different duration values for each process but that didn't work. Can someone help me to understand this problem or are there any other time measuring styles?

Sorry if this is a dumb problem and for my bad grammar.

int main(){

    srand(time(NULL));

    int N, duration;

    cout << endl << "N : ";
    cin >> N; // N is array sze.
    cout << endl;

    // a4 would be the buffer array (for calculating proper duration).
    int *a1 = new int[N];
    int *a2 = new int[N];
    int *a3 = new int[N];
    int *a4 = new int[N];

    cout << endl << "Unsorted array : " << endl;

    for (int i = 0; i < N; i++){

        a4[i] = rand() % 100;
        cout << a4[i] << " ";
    }

/*------------------------------------------------------------------------------*/

    cout << endl << endl <<"Sorting with Insertion Sort, please wait..." << endl;

    for(int i = 0; i < 2000; i++){

        a1 = a4;

        duration = clock();
        insertionSort(a1, N - 1);
        duration += clock() - duration;
    }

    cout << endl << "Insertion sort : " << endl;

    print(a1, N);

    cout << endl << endl << "Approximate duration for Insertion Sort : ";
    cout << (double) (duration / 2000) / CLOCKS_PER_SEC;
    cout << " s." << endl;

/*------------------------------------------------------------------------------*/

    cout << endl << endl << "Sorting with Merge Sort, please wait..." << endl;

    for(int i = 0; i < 2000; i++){

        a2 = a4;

        duration = clock();
        mergeSort(a2, 0, N - 1);
        duration += clock() - duration;
    }

    cout << endl << "Merge sort : " << endl;

    print(a2, N);

    cout << endl << endl << "Approximate duration for Merge Sort : ";
    cout << (double) (duration / 2000) / CLOCKS_PER_SEC;
    cout << " s."<< endl << endl;

/*------------------------------------------------------------------------------*/

    cout << endl << endl << "Sorting with Heap Sort, please wait..." << endl;

    for(int i = 0; i < 2000; i++){

        a3 = a4;
        duration = clock();
        heapSort(a3, N);
        duration += clock() - duration;
    }

    cout << endl << "Heap sort : " << endl;

    print(a3, N);

    cout << endl << endl << "Approximate duration for Heap Sort : ";
    cout << (double) (duration / 2000) / CLOCKS_PER_SEC;
    cout << " s."<< endl << endl;

    return 0;
}
like image 855
burakongun Avatar asked Jan 16 '23 00:01

burakongun


2 Answers

The error in your program is that you reset duration throughout the loop. A cleaner way to handle the time would be to put the duration variable modification outside the for loop. For example:

duration = clock();
for(int i = 0; i < 2000; i++){
    a2 = a4;
    mergeSort(a2, 0, N - 1);
}
duration = clock() - duration

EDIT: forgot to remove the part inside the loop. Fixed now.

like image 119
jma127 Avatar answered Jan 22 '23 16:01

jma127


Number one, you don't seem to reset duration between runs of the different sorts. This means the sum of individual iteration durations would be propagating down through each sorting phase (if the next point weren't also a problem).

Next, you need to setup a separate variable, call it durationSum and use that as you are currently using duration in the summary phase after iterating. Currently, you're blowing away your sum on every iteration.

For example:

clock_t durationSum = 0;
clock_t duration = 0;

for(int i = 0; i < 2000; i++){

    a1 = a4;

    duration = clock();
    insertionSort(a1, N - 1);
    durationSum += clock() - duration;
}

Next, you're making a type error when amortizing duration. You have:

cout << (double) (duration / 2000) / CLOCKS_PER_SEC;

With minimal edits, this would work more precisely (but should use durationSum):

cout << (double) (duration / 2000.0) / CLOCKS_PER_SEC;

Before, you were saying "use integer division to divide duration by 2000, THEN promote it to a double and divide by CLOCKS_PER_SEC (this time with floating-point division because one of the operands is a double and one integral). Using 2000.0 forces duration to be promoted to a double for a floating-point division by 2000.

Finally, it would be better to consider the loop overhead negligible compared to a single sort iteration and do only two calls to clock() per 2000 sort-iterations.

For example:

clock_t insert_sort_start = clock();

for(int i = 0; i < 2000; i++){
    a1 = a4;
    insertionSort(a1, N - 1);
}

double duration_sec = (clock() - insert_sort_start) / 2000.0 / CLOCKS_PER_SEC;

Finally, note that you are using duration as an int whereas in reality it is a clock_t, and if you are on a 64-bit system, it is very likely that this is a 64-bit number being returned by clock() and "narrowed" (downcast) into a 32-bit integer int. Use clock_t.

like image 21
Matthew Hall Avatar answered Jan 22 '23 16:01

Matthew Hall