I have 2D array of size m*m
with element values either 0s or 1s. Furthermore, each column of the array has a contiguous block of 1s (with 0 outside that block). The array itself is too large to be held in memory (as many as 10^6 rows), but for each column I can determine the lower bound, a
, and the upper bound, b
, of the 1s in that column. For a given n
, I need to find out those n
consecutive rows which have the maximum number of 1s. I can easily do it for smaller numbers by calculating the sum of each row one by one, and then choosing n
consecutive rows whose sum comes out to be maximum, but for large numbers, it is consuming too much time. Is there any efficient way for calculating this? Perhaps using Dynamic Programming?
Here is an example code fragment showing my current approach, where successive calls to read_int()
(not given here) provide the lower and upper bounds for successive columns:
long int harr[10000]={0}; //initialized to zero
for(int i=0;i<m;i++)
{
a=read_int();
b=read_int();
for(int j=a;j<=b;j++) // for finding sum of each row
harr[j]++;
}
answer=0;
for(int i=0;i<n;i++)
{
answer=answer+harr[i];
}
current=answer;
for(int i=n;i<m;i++)
{
current=current+harr[i]-harr[i-n];
if(current>answer)
{
answer=current;
}
}
For example (with m
= 6 and n
= 3)
Here the answer would be row 1 to row 3 with a total 1-count of 13 in those rows. (Row 2 to row 4 also maximizes the sum as there is a tie.)
Solution steps We initialize the variable maxOneCount = 0 to track max one count so far. We run nested loops to explore each subarray i.e. outer loop from i = 0 to n - 1 and inner loop from j = i to n - 1. Before starting the inner loop, we also initialize variable oneCount = 0 to track the ones count in the subarray.
We use arrayname. length to determine the number of rows in a 2D array because the length of a 2D array is equal to the number of rows it has. The number of columns may vary row to row, which is why the number of rows is used as the length of the 2D array.
Here is a different approach. Think of each pair a
, b
as defining an interval of the form [a,b+1). The task is to find the n
consecutive indices which maximizes the sum of the parenthesis depth of the numbers in that interval. Every new a
bumps the parenthesis depth at a
up by 1. Every new b
causes the parenthesis depth after b
to go down by 1. In the first pass -- just load these parentheses depth deltas. Then one pass gets the parenthesis depths from these deltas. The following code illustrates this approach. I reduced m
to 6 for testing purposes and replaced calls to the unkown read_int()
by accesses to hard-wired arrays (which correspond to the example in the question):
#include <stdio.h>
int main(void){
int a,b,answer,current,lower,upper;
int n = 3;
int lower_bound[6] = {0,1,2,3,1,2};
int upper_bound[6] = {3,4,3,5,2,4};
int m = 6;
int harr[6]={0};
//load parenthesis depth-deltas (all initially 0)
for(int i=0;i<m;i++)
{
a = lower_bound[i];
b = upper_bound[i];
harr[a]++;
if(b < m-1)harr[b+1]--;
}
//determine p-depth at each point
for(int i = 1; i < m; i++){
harr[i] += harr[i-1];
}
//find optimal n-rows by sliding-window
answer = 0;
for(int i=0;i<n;i++)
{
answer = answer+harr[i];
}
current =answer;
lower = 0;
upper = n-1;
for(int i=n;i<m;i++)
{
current = current+harr[i]-harr[i-n];
if(current>answer)
{
answer = current;
lower = i-n+1;
upper = i;
}
}
printf("Max %d rows are %d to %d with a total sum of %d ones\n", n,lower,upper,answer);
return 0;
}
(Obviously, the loop which loads harr
can be combined with the loop which computes answer
. I kept it as two passes to better illustrate the logic of how the final harr
values can be obtained from the parentheses deltas).
When this code is compiled and run its output is:
Max 3 rows are 1 to 3 with a total sum of 13 ones
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With