Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using mergesort with presorted intervals

I am trying to prevent any unnecessary recursive calls into mergesort since my array is presorted by parts for example:

22, 233, 445, 1055, 1, 14, 94, 74545, 75, 134, 323, 9090, 2, 43, 6342, 323452

I am using the generic mergesort implementation

void merge(int a[], int low, int mid, int high)
{
    int b[10000];
    int i = low, j = mid + 1, k = 0;

    while (i <= mid && j <= high) {
        if (a[i] <= a[j])
            b[k++] = a[i++];
        else
            b[k++] = a[j++];
    }
    while (i <= mid)
        b[k++] = a[i++];

    while (j <= high)
        b[k++] = a[j++];

    k--;
    while (k >= 0) {
        a[low + k] = b[k];
        k--;
    }
}

void mergesort(int a[], int low, int high)
{
    if (low < high) {
        int m = (high + low)/2;
        mergesort(a, low, m);
        mergesort(a, m + 1, high);
        merge(a, low, m, high);
    }
}

If my program knows that every 4 elements is already sorted how can I modify mergesort to start merging from sorted groups of 4 instead of breaking the array down to the single element parts and then starting the merge?

Example:

22,233,445,1055     1,14,94,74545,     75,134,323,9090     2,43,6342,323452
      \                  /                    \                   /
       \                /                      \                 /
   1,14,22,94,233,445,1055,74545        2,43,75,134,323,6342,9090,323452
              \                                           /
               \                                         /
        1,2,14,22,43,75,94,134,233,323,445,1055,6342,9090,74545,323452  
like image 587
kirax Avatar asked Feb 19 '14 07:02

kirax


People also ask

What is the disadvantage of MergeSort compared to quick sort?

Merge sort can work well on any type of data sets irrespective of its size (either large or small). The quick sort cannot work well with large datasets. Additional storage space requirement : Merge sort is not in place because it requires additional memory space to store the auxiliary arrays.

Can MergeSort be done in place?

The standard implementation of merge sort is not in-place; but we can make it in-place by modifying the way we merge the lists. However, this will affect the run-time complexity of the algorithm. So basically, standard merge sort with a modified method to merge the lists in-place is called in-place merge sort.

Why is MergeSort a good choice for sorting a large file?

Merge sort requires more space as it creates an extra array for storing, and no matter what it will compare every item. Quick sort on the other hand does not require extra space, and doesn't swap or compare more than necessary.

Does MergeSort use recursion?

Merge sort is a recursive algorithm that continually splits a list in half. If the list is empty or has one item, it is sorted by definition (the base case).


1 Answers

What you implemented is Top-down mergesort, which means to split the array into two parts, sort each, and merge them together. This is done recursively. The problem of it is, assuming the array has 12 elements, then it would split the array into 2 6-element arrays, that would take no advantage of the fact that every 4 elements are already sorted.

You should instead use Bottom-up mergesort, that is, split the array into subarrays, each of the subarrays have 4 elements. Since each of them are already sorted, merge every two subarrays into 8-element subarrays, and then merge every two 8-element subarrays into 16-element subarrays, and so on. The code of sorting N-element array is like this:

for (int sz = 1; sz < N; sz = sz+sz)
    for (int lo = 0; lo < N-sz; lo += sz+sz)
        merge(a, lo, lo+sz-1, min(lo+sz+sz-1, N-1)); 

Since you already know that every 4 elements are sorted, you can start with sz being 4, which takes full advantage of the knowledge.

like image 116
Yu Hao Avatar answered Sep 22 '22 13:09

Yu Hao