Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Maximum Subarray: Divide and Conquer

Disclaimer: this is for an assignment. I am not asking for explicit code, only enough help to understand the algorithm involved so that I might fix the errors in my code.

Okay, so you're probably familiar with the maximum subarray problem: calculate and return the largest contiguous block of integers in an array. Simple enough, but this assignment requires me to do it in three different complexities: O(n^3), O(n^2), and O(n log n). I've gotten the first two without much trouble (brute force) but the third is giving me headaches.. literally.

I understand how the algorithm is supposed to work: an array is passed to a function which splits it up into halves recursively, then compares the individual components to find the maximum subarray in each half. Then, because the maximum subarray must resides entirely in the left or right half, or in a segment overlapping the two, you must find the maximum subarray that overlaps the left and right. Compare the max subarrays for each case, and the largest will be your return value.

I believe that I have written code that performs that task adequately, but I seem to be wrong in that assessment. I've been trying to contact the instructor and TA for help, but I don't feel that I'm getting anywhere with either of them. Below is the code I've managed to write thus far. Please tell me if you see any glaring errors. Again, I am not looking for explicit code or answers, but help understanding what I'm doing wrong. I've looked through all the similar cases presented here and haven't found anything that can really help me. I've also done plenty of Google searches for guidance, but that doesn't help much, either. Anyway, here's the code in question:

int conquer(int arr[], int first, int mid, int last) {

    int i = 0;
    int maxLeft = 0;
    int maxRight = 0;

    int temp = 0;
    for (i = mid; i >= first; i--) {
        temp = temp + arr[i];
        if (maxLeft < temp) {
            maxLeft = temp;
        }
    }
    temp = 0;
    for (i = (mid + 1); i <= last; i++) {
        temp = temp + arr[i];
        if (maxRight < temp) {
            maxRight = temp;
        }
    }

    return (maxLeft + maxRight);
}

int divide(int arr[], int start, int end) {

    int i;
    int maxSum;
    int maxLeftSum;
    int maxRightSum;
    int maxOverlapSum;

    if (start == end) {
        return arr[start];
    } else {
        int first = start;
        int mid = (end / 2);
        int last = end;

        maxSum = 0;
        maxLeftSum = 0;

        for (i = first; i < mid; i++) {
            maxSum = maxSum + arr[i];
            if (maxLeftSum < maxSum) {
                maxLeftSum = maxSum;
            }
        }
        for (i = (mid + 1); i < last; i++) {
            maxSum = maxSum + arr[i];
            if (maxRightSum < maxSum) {
                maxRightSum = maxSum;
            }
        }

        maxOverlapSum = conquer(arr, first, mid, last);
    }

    if ((maxLeftSum > maxRightSum) && (maxLeftSum > maxOverlapSum)) {
        return maxLeftSum;
    } else if ((maxRightSum > maxLeftSum) && (maxRightSum > maxOverlapSum)) {
        return maxRightSum;
    } else
        return maxOverlapSum;
}

Edit: the error I'm getting is an incorrect result. I have consistent and correct results between my two other algorithms, but this one is incorrect.

Edit #2: Here is an updated version of my code, slimmed down a bit and I fixed a few things. It still isn't running correctly, but it should be much closer...

#include <stdio.h>
#include <stdlib.h>

int numarr[] = {22, -27, 38, -34, 49, 40, 13, -44, -13, 28, 46, 7, -26, 42,
        29, 0, -6, 35, 23, -37, 10, 12, -2, 18, -12, -49, -10, 37, -5, 17,
        6, -11, -22, -17, -50, -40, 44, 14, -41, 19, -15, 45, -23, 48, -1,
        -39, -46, 15, 3, -32, -29, -48, -19, 27, -33, -8, 11, 21, -43, 24,
        5, 34, -36, -9, 16, -31, -7, -24, -47, -14, -16, -18, 39, -30, 33,
        -45, -38, 41, -3, 4, -25, 20, -35, 32, 26, 47, 2, -4, 8, 9, 31, -28,
        36, 1, -21, 30, 43, 25, -20, -42};

int length = ((sizeof(numarr))/(sizeof(int)));

int divide(int left, int right) {

    int mid, i, temp, mLeft, mRight, mCross = 0;

    if (left == right) {
        return left;
    } else if (left > right) {
        return 0;
    } else {
        mid = (left + right) / 2;

        divide(left, mid);
        divide(mid + 1, right);

        for (i = mid; i >= left; i--) {
            temp = temp + numarr[i];
            if (mLeft < temp) {
                mLeft = temp;
            }
        }

        for (i = mid + 1; i <= right; i++) {
            temp = temp + numarr[i];
            if (mRight < temp) {
                mRight = temp;
            }
        }

        mCross = (mLeft + mRight);

        printf("mLeft:  %d\n", mLeft);
        printf("mRight: %d\n", mRight);
        printf("mCross: %d\n", mCross);

        return 0;
    }
}

int main(int argc, char const *argv[])
{
    divide(0, length);
    return 0;
}
like image 923
idigyourpast Avatar asked Feb 01 '13 02:02

idigyourpast


3 Answers

I'm still staring at your problem, but I noticed several error almost immediately.

First, if first and last are anything like their names intend, you find the midpoint incorrectly. You do this:

mid = end/2;

when it should be this:

mid = first + (last-first)/2;

Next, your first enumeration loop runs from [first,mid) (note the exclusion of mid on the right side). This loop does not include the arr[mid] element:

    for (i = first; i < mid; i++) {

Your second runs from [mid+1,last), which also, does not include the arr[mid] element:

    for (i = (mid + 1); i < last; i++) {

This leave a hole of one element, specifically arr[mid]. Now, I'm not claiming I fully understand the algorithm, as I have just barely had a chance to read it, but if you're purposed with covering the entire range from [first,last), this is likely not going to do it. Also, the textbook algorithm as presented by the paper linked by SauceMaster is at the distinct disadvantage of using a language that doesn't allow you to offset into array and pass it via pointer-decay into a function call as the base-address of an array. C allows you to do this and you should take advantage of it. I think you'll find it makes the numbers easier to understand, and eliminate the need for one of your passed-in indexes.

For example: a function that takes an array and mid-splits and recurses could look something like this:

void midsplit( int arr[], int len)
{
    if (len < 2)
    {
         // base case
    }
    else
    {
        int mid = len/2;
        midsplit(arr, mid);
        midsplit(arr+mid, len-mid);

        // cumulative case
    } 
}

In each recursion, the split point is always the end of one range, and used to offset-address the second range, which is treated as its own 0-based array in the recursive call. Dunno if you can use that, but it does make it a little easier (at least for me) to grasp.

Finally, your divide doesn't seem to be doing much recursing that I can see, which is going to be a problem, since this is, after all, a recursive algorithm. It seems you're missing at least one call to divide() in there.

I may have missed something, which wouldn't be the first time, but as I said, I didn't pour too much into it (yet).

like image 148
WhozCraig Avatar answered Oct 19 '22 07:10

WhozCraig


John Bentley published a paper on this in 1984. You can find the PDF online for free: http://www.akira.ruc.dk/~keld/teaching/algoritmedesign_f03/Artikler/05/Bentley84.pdf

He begins the discussion on the O(n log n) approach on the second page.

like image 40
AndyG Avatar answered Oct 19 '22 08:10

AndyG


I think you have almost all the code you need, but these two issues stand out to me:

  • The mid variable calculation is suspect.
  • Your divide function isn't really doing any dividing.

In a recursive formulation of divide an conquer, you would recursively call your divide function on the lower half of the array, and then on the upper half of the array. The conquer step could be considered to be the code that calculates the overlap sum and returns the largest of the three candidates.

Edit: When thinking about the problem recursively, realize the purpose of the function, and leverage it. In this case, the divide function will return the maximum sub-array sum for the provided array. Therefore, the way to calculate maxLeftSum is to call divide on the left sub-array. Similarly for maxRightSum.

int divide(int arr[], int start, int end) {
    if (end > start) {
        int mid = (start + end)/2;
        int maxLeftSum = divide(arr, start, mid);
        int maxRightSum = divide(arr, mid+1, end);
        return conquer(arr, start, end, maxLeftSum, maxRightSum);
    }
    return (start == end) ? arr[start] : 0;
}

Good luck!

like image 2
jxh Avatar answered Oct 19 '22 09:10

jxh