Input: A 2-dimensional N x N array (a matrix) with positive and negative elements.
Output: A submatrix of any size whose sum is the maximum among all possible submatrices.
Requirement: The algorithm must run in O(N^3) time.
History: With the help of the Algorithmist, Larry and a modification of Kadane's Algorithm, I managed to solve part of the problem, namely determining the maximum sum only - below in Java.
Thanks to Ernesto, who managed to solve the rest of the problem, namely determining the boundaries of the submatrix, i.e. its top-left and bottom-right corners - below in Ruby.
For each such matrix, in corresponding rows, there are n + 1 submatrices (exactly one of each width 1, 2, 3, ..., n + 1). Hence, in total there are m(m + 1)(n + 1)/2 submatrices which are newly added. This can be written as a recursive formula: f(m, n + 1) = f(m, n) + m(m + 1)(n + 1)/2.
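If you want to convince yourself of the recurrence, a small check along these lines might help. It is only a sketch: the function names are made up for illustration, and the base case f(m, 0) = 0 is my assumption, since the text does not state one.

```python
def count_submatrices_brute(m: int, n: int) -> int:
    """Count submatrices of an m x n matrix by enumerating every row and column range."""
    count = 0
    for top in range(m):
        for bottom in range(top, m):
            for left in range(n):
                for right in range(left, n):
                    count += 1
    return count


def count_submatrices_recurrence(m: int, n: int) -> int:
    """Unroll f(m, k) = f(m, k - 1) + m(m + 1)k / 2, assuming f(m, 0) = 0."""
    total = 0
    for width in range(1, n + 1):
        # m(m + 1) is always even, so integer division is exact here.
        total += m * (m + 1) * width // 2
    return total


print(count_submatrices_brute(3, 3), count_submatrices_recurrence(3, 3))
```

Both functions should agree for any small m and n; for a 3 x 3 matrix they give 36.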
A matrix is given. We need to find a rectangular (possibly square) submatrix whose sum is maximum. The idea behind this algorithm is to fix the left and right columns, compute for each row the sum of the elements from the left column to the right column, and store these row sums temporarily.
Suppose the index of an element is (X, Y) in 0-based indexing. Then the number of submatrices S(X, Y) that contain this element is given by the formula S(X, Y) = (X + 1) * (Y + 1) * (N - X) * (N - Y).
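The counting formula can be checked against a direct enumeration. This is a rough sketch with hypothetical function names, assuming an N x N matrix and 0-based (X, Y) as in the text.

```python
def count_containing_brute(n: int, x: int, y: int) -> int:
    """Count submatrices of an n x n matrix that contain cell (x, y)."""
    count = 0
    for top in range(n):
        for bottom in range(top, n):
            for left in range(n):
                for right in range(left, n):
                    if top <= x <= bottom and left <= y <= right:
                        count += 1
    return count


def count_containing_formula(n: int, x: int, y: int) -> int:
    """(x + 1) choices for the top, (n - x) for the bottom, and likewise for columns."""
    return (x + 1) * (y + 1) * (n - x) * (n - y)


print(count_containing_brute(3, 1, 1), count_containing_formula(3, 1, 1))
```

For the centre cell of a 3 x 3 matrix both give 16.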
Here's an explanation to go with the posted code. There are two key tricks to make this work efficiently: (I) Kadane's algorithm and (II) using prefix sums. You also need to (III) apply the tricks to the matrix.
Part I: Kadane's algorithm
Kadane's algorithm is a way to find a contiguous subsequence with maximum sum. Let's start with a brute force approach for finding the max contiguous subsequence and then consider optimizing it to get Kadane's algorithm.
Suppose you have the sequence:
-1, 2, 3, -2
For the brute force approach, walk along the sequence generating all possible subsequences as shown below. Considering all possibilities, we can start, extend, or end a list with each step.
At index 0, we consider appending the -1
    -1, 2, 3, -2
     ^
    Possible subsequences:
       -1   [sum -1]

At index 1, we consider appending the 2
    -1, 2, 3, -2
        ^
    Possible subsequences:
       -1 (end)   [sum -1]
       -1, 2      [sum 1]
       2          [sum 2]

At index 2, we consider appending the 3
    -1, 2, 3, -2
           ^
    Possible subsequences:
       -1 (end)      [sum -1]
       -1, 2 (end)   [sum 1]
       2 (end)       [sum 2]
       -1, 2, 3      [sum 4]
       2, 3          [sum 5]
       3             [sum 3]

At index 3, we consider appending the -2
    -1, 2, 3, -2
               ^
    Possible subsequences:
       -1 (end)         [sum -1]
       -1, 2 (end)      [sum 1]
       2 (end)          [sum 2]
       -1, 2, 3 (end)   [sum 4]
       2, 3 (end)       [sum 5]
       3 (end)          [sum 3]
       -1, 2, 3, -2     [sum 2]
       2, 3, -2         [sum 3]
       3, -2            [sum 1]
       -2               [sum -2]
For this brute force approach, we finally pick the list with the best sum, (2, 3), and that's the answer. However, to make this efficient, consider that you really don't need to keep every one of the lists. Out of the lists that have not ended, you only need to keep the best one; the others cannot do any better. Out of the lists that have ended, you might only need to keep the best one, and only if it's better than the ones that have not ended.
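The brute force walk can be sketched in Python roughly like this. It is an illustration only: the function name and the (sum, start, end) return shape are my own choices, not part of the posted code.

```python
def max_subsequence_brute(seq):
    """Return (best_sum, start, end) over all non-empty contiguous runs.

    `end` is inclusive. Returns None for an empty sequence.
    """
    best = None
    for start in range(len(seq)):
        running = 0
        for end in range(start, len(seq)):
            running += seq[end]  # sum of seq[start..end]
            if best is None or running > best[0]:
                best = (running, start, end)
    return best


print(max_subsequence_brute([-1, 2, 3, -2]))  # (5, 1, 2)
```

This tries every start and end, so it takes O(n^2) time for a sequence of length n, which is what the optimization is meant to avoid.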
So, you can keep track of what you need with just a position array and a sum array. The position array is defined like this: position[r] = s keeps track of the list which ends at r and starts at s. And, sum[r] gives the sum of the subsequence ending at index r. This optimized approach is Kadane's algorithm.
Running through the example again keeping track of our progress this way:
At index 0, we consider appending the -1
    -1, 2, 3, -2
     ^
    We start a new subsequence for the first element.
    position[0] = 0   sum[0] = -1

At index 1, we consider appending the 2
    -1, 2, 3, -2
        ^
    We choose to start a new subsequence because that gives a higher sum than extending.
    position[0] = 0   sum[0] = -1
    position[1] = 1   sum[1] = 2

At index 2, we consider appending the 3
    -1, 2, 3, -2
           ^
    We choose to extend a subsequence because that gives a higher sum than starting a new one.
    position[0] = 0   sum[0] = -1
    position[1] = 1   sum[1] = 2
    position[2] = 1   sum[2] = 5

At index 3, we consider appending the -2
    -1, 2, 3, -2
               ^
    Again, we choose to extend because that gives a higher sum than starting a new one.
    position[0] = 0   sum[0] = -1
    position[1] = 1   sum[1] = 2
    position[2] = 1   sum[2] = 5
    position[3] = 1   sum[3] = 3
Again, the best sum is 5 and the list is from index 1 to index 2, which is (2, 3).
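Here is a minimal Python sketch of the same bookkeeping, assuming a non-empty input list. The array named `total` plays the role of sum[r] in the text (renamed only to avoid shadowing Python's built-in `sum`), and the function name is hypothetical.

```python
def kadane(seq):
    """Return (best_sum, start, end) for a non-empty list; `end` is inclusive."""
    n = len(seq)
    position = [0] * n  # position[r] = start of the best run ending at r
    total = [0] * n     # total[r] = sum of that run (sum[r] in the text)
    position[0], total[0] = 0, seq[0]
    for r in range(1, n):
        if total[r - 1] > 0:
            # Extending the previous run beats starting fresh.
            position[r] = position[r - 1]
            total[r] = total[r - 1] + seq[r]
        else:
            # Start a new run at r.
            position[r] = r
            total[r] = seq[r]
    best_end = max(range(n), key=lambda r: total[r])
    return total[best_end], position[best_end], best_end


print(kadane([-1, 2, 3, -2]))  # (5, 1, 2)
```

The full arrays are kept here only to mirror the walk-through; since each step looks at index r - 1 alone, two running variables plus a best-so-far would do the same job in O(1) extra space.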
Part II: Prefix sums
We want to have a way to compute the sum along a column, from any start point to any endpoint. We want to compute that sum in O(1) time rather than just adding, which takes O(m) time where m is the number of elements in the sum. With some precomputing, this can be achieved. Here's how. Suppose you have a matrix:
    a   d   g
    b   e   h
    c   f   i

You can precompute this matrix:

    a       d       g
    a+b     d+e     g+h
    a+b+c   d+e+f   g+h+i
Once that is done you can get the sum running along any column from any start to endpoint in the column just by subtracting two values.
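As a rough illustration of the precomputation and the O(1) lookup, something like the following would work. The helper names `column_prefix_sums` and `column_sum` are invented for this sketch.

```python
def column_prefix_sums(matrix):
    """Return prefix where prefix[i][j] = matrix[0][j] + ... + matrix[i][j]."""
    prefix = []
    for i, row in enumerate(matrix):
        if i == 0:
            prefix.append(list(row))
        else:
            prefix.append([above + value for above, value in zip(prefix[i - 1], row)])
    return prefix


def column_sum(prefix, col, top, bottom):
    """Sum of column `col` from row `top` to row `bottom`, both inclusive."""
    upper = prefix[top - 1][col] if top > 0 else 0
    return prefix[bottom][col] - upper


matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
prefix = column_prefix_sums(matrix)
print(prefix)                       # [[1, 2, 3], [5, 7, 9], [12, 15, 18]]
print(column_sum(prefix, 1, 1, 2))  # 5 + 8 = 13
```

When the range starts at row 0 there is nothing to subtract, which is why `column_sum` treats that case separately.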
Part III: Bringing tricks together to find the max submatrix
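Assume that you know the top and bottom row of the max submatrix. You could do this: ignore the rows above the top row and below the bottom row. In what remains, take the sum of each column to form a sequence with one number per column; the prefix sums from Part II give each of these numbers in O(1) time. Then run Kadane's algorithm on that sequence. The start and end indexes it finds are the left and right columns of the best submatrix for that top and bottom.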
Now, what about actually figuring out the top and bottom row? Just try all possibilities. Try putting the top anywhere you can and putting the bottom anywhere you can, and run the Kadane-based procedure described previously for every possibility. When you find a new max, you keep track of the top and bottom positions.
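Putting the three parts together, a self-contained Python sketch might look like this. It is not the posted Java or Ruby code; the function name, the tuple it returns, and the extra all-zero row in the prefix table are my own choices for illustration, and it assumes a non-empty rectangular matrix.

```python
def max_submatrix(matrix):
    """Return (best_sum, top, left, bottom, right); all indices are inclusive."""
    rows, cols = len(matrix), len(matrix[0])

    # prefix[i + 1][j] = sum of column j over rows 0..i; prefix[0] is all zeros
    # so that a range starting at row 0 needs no special case.
    prefix = [[0] * cols for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            prefix[i + 1][j] = prefix[i][j] + matrix[i][j]

    best = None
    for top in range(rows):
        for bottom in range(top, rows):
            # Kadane over the sequence of column sums between top and bottom.
            running, start = 0, 0
            for col in range(cols):
                value = prefix[bottom + 1][col] - prefix[top][col]
                if running > 0:
                    running += value               # extend the current run
                else:
                    running, start = value, col    # start a new run at col
                if best is None or running > best[0]:
                    best = (running, top, start, bottom, col)
    return best


example = [[1, 2, -1, -4, -20],
           [-8, -3, 4, 2, 1],
           [3, 8, 10, 1, 3],
           [-4, -1, 1, 7, -6]]
print(max_submatrix(example))  # (29, 1, 1, 3, 3)
```

In the example, the best submatrix spans rows 1 to 3 and columns 1 to 3. If several submatrices tie for the maximum, this sketch reports the first one it encounters.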
Trying every top and bottom row takes O(M^2) combinations, where M is the number of rows. For each combination, finding the left and right columns takes O(N) time, where N is the number of columns. So the total time is O(M^2 * N). And, if M = N, the time required is O(N^3).