Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the largest basin size in a given matrix

Problem:

This is an interview Question.

A group of farmers has some elevation data, and we’re going to help them understand how rainfall flows over their farmland.

We’ll represent the land as a two-dimensional array of altitudes and use the following model, based on the idea that water flows downhill:

If a cell’s eight neighboring cells all have higher altitudes, we call this cell a basin; water collects in basin.

Otherwise, water will flow to the neighboring cell with the lowest altitude.

Cells that drain into the same sink – directly or indirectly – are said to be part of the same basin.

A few examples are below:

Input:

1 1 2
1 1 7
3 6 9

size 4

9 9 9 8 7 7
8 8 7 7 7 8
8 8 8 7 7 7
8 8 8 9 9 9
8 8 8 7 7 7
4 4 5 5 5 5
5 5 5 6 6 7
5 5 5 8 8 6

size 8

9 9 9 8 8 8
8 8 8 7 7 7
7 7 7 7 7 7
8 8 8 8 9 9
5 5 5 5 6 3
5 5 5 3 3 3

size 9

the highlighted values forms the maximum size basin.

So the problem is

To partition the map into basins. In particular, given a map of elevations, your code should partition the map into basins and output the sizes of largest basin. We need to highlight the maximum size basin.

IF the problem were had this assumption

"If a cell is not a sink, you may assume it has a unique lowest neighbor and that this neighbor will be lower than the cell"

then i can think of this solution

    Each array element is a node in a graph. Construct the graph adding edges between the nodes:
  1 If node A is the smallest among all of its own neighbors, don't add an edge (it's a sink)
  2 There is an edge between two neighbors A and B iff A is the smallest of all neighbors of B.
  3 Finally traverse the graph using BFS or DFS and count the elements in the connected components.

uptill now i had implemented 3rd part of the algorithm

#include<iostream>
#include<vector>
#include<string.h>
#include<cstdio>
#include<algorithm>
#include<queue>
using namespace std;
int cv[1000]; // array stores number of nodes in each connected components
int main()
{
queue<int>q;
bool visited[100000];

int t,i,j,x,y,cvindex=0;
int n,e;
cin>>t;
while(t--)
{
scanf("%d%d",&n,&e);
vector< vector<int> >G(n);
memset(visited,0,sizeof(visited));

for(i=0;i<e;i++)
{
scanf("%d%d",&x,&y);
G[x].push_back(y);
G[y].push_back(x);
}

int ans=0;
for(i=0;i<n;i++)
{
if(!visited[i]) 
{
q.push(i);
visited[i]=1;
cv[cvindex]++;

while(!q.empty())
{
int p=q.front();
q.pop();
for(j=0;j<G[p].size();j++)
{
if(!visited[G[p][j]])
{
visited[G[p][j]]=1;
q.push(G[p][j]);
cv[cvindex]++;
}
}
}
ans++;
cvindex++;
}
}
printf("%d\n",ans);
sort(cv,cv+cvindex);
for(int zz=0;zz<cvindex;zz++)
printf("%d ",cv[zz]);
}
}   

Time complexity O(n*m)

But how to approach the above problem without the assumption? I want the almost similar approach with slight modification.

Other algorithms are welcomed.

Moreover does there exist better algorithm in terms of time complexity?

like image 990
prashantitis Avatar asked Oct 21 '22 06:10

prashantitis


2 Answers

This is my working code. I also commented my each and every step for your understanding. If you still found some help you can ask.

Algorithm

  1. First store index according to their heights.
  2. Then iterate from smallest height to largest height.
  3. If current index is not visited yet, then make it Basin surface(where water can collect), and make all neighbours whose height is larger than this as Non Basin surface.
  4. Repeat step 3 till all indexes are visited.
  5. Then after decaring states of each index. We need to find the largest basin surface. That we can find by using DFS.

Time Complexity : O(ROWS*COLUMNS)

#include<iostream>
#include<vector>
#include<string.h>
#include<climits>

#define BASIN 1
#define NOT_BASIN 2
#define NOT_DEFINED_YET 3

#define ROW 1000
#define COLUMN 1000
#define MAXIMUM_HEIGHT_POSSIBLE 1000

using namespace std;

int heights[ROW][COLUMN];  // It stores the height
int maximumBasin[ROW][COLUMN]; // It stores the state of each index, Total 3 states possible, ( BASIN, NOT_BASIN, NOT_DEFINED_YET )
bool alreadyVisited[ROW][COLUMN]; // True, if currect index visited, otherwise false.
vector< pair<int, int> > heightsCoordinates[MAXIMUM_HEIGHT_POSSIBLE]; // It stores all the indexs of given height.
int N, M, maxHeightPossible;

int dx[] = {0, 1, 1, 1, 0, -1, -1, -1};
int dy[] = {1, 1, 0, -1, -1, -1, 0, 1};

bool isValidLocation(int x, int y) {
    if(x < 0 || x > M || y < 0 || y > N || alreadyVisited[x][y] == true) return false;
    return true;
}

void DFS_FOR_MARKING_WITH_GIVEN_VALUE(int value, int x, int y) {
    maximumBasin[x][y] = value;
    alreadyVisited[x][y] = true;
    for(int i = 0; i < 8; i++) if( isValidLocation(x + dx[i], y + dy[i]) && heights[x + dx[i]][y + dy[i]] >= heights[x][y] ) DFS_FOR_MARKING_WITH_GIVEN_VALUE(value, x + dx[i], y + dy[i]);
}

void DFS_FOR_COUNTING_BASINS_TOGETHER(int &cnt, int x, int y) {
    cnt++;
    alreadyVisited[x][y] = true;
    for(int i = 0; i < 8; i++) if( isValidLocation(x+dx[i], y+dy[i]) && maximumBasin[x + dx[i]][y + dy[i]] ==  BASIN ) DFS_FOR_COUNTING_BASINS_TOGETHER(cnt, x + dx[i], y + dy[i]);
}

void printBasin() {
    for(int i = 0; i < M; i++) {
        for(int j = 0; j < N; j++) cout << maximumBasin[i][j] << "  ";
        cout << endl;
    }
}

main() {

    cin >> M >> N >> maxHeightPossible;
    int x, y;
    int maximumCounts = INT_MIN;
    int cntTemp = 0;

    /**
     Take input and set NOT_DEFINED_YET for maximumBasin.
    **/
    for(int i = 0; i < M; i++) {
        for(int j = 0; j < N; j++) {
             cin >> heights[i][j];
             maximumBasin[i][j] = NOT_DEFINED_YET;
             heightsCoordinates[ heights[i][j] ].push_back(pair<int, int>(i, j));
        }
    }

    /**
    Iterate from smallest to largest height.
    If current index is  "NOT_DEFINED_YET" (means it is the candidate index where water can collect).  Water will come here from all neighbourhood whose height is greater than this.
    For that I call DFS_FOR_MARKING_WITH_GIVEN_VALUE function.
    **/
    for( int i = 0; i <= maxHeightPossible; i++ ){
        if(heightsCoordinates[i].size() == 0) continue;
        for(int j = 0; j < heightsCoordinates[i].size(); j++) {
            x = heightsCoordinates[i][j].first;
            y = heightsCoordinates[i][j].second;
            if( maximumBasin[x][y] == NOT_DEFINED_YET ) {
                maximumBasin[x][y] = BASIN;
                alreadyVisited[x][y] = true;
                for(int k = 0; k < 8; k++) {
                    if( isValidLocation( x + dx[k], y + dy[k] ) ) {
                        if ( heights[x + dx[k]][ y + dy[k]] > heights[x][y] ) {
                            DFS_FOR_MARKING_WITH_GIVEN_VALUE(NOT_BASIN, x + dx[k], y + dy[k]);
                        }
                    }
                }
            }
            else {
                // If  it is set by BASIN or NOT_BASIN, Shows already processed before.
            }
        }
    }

    //printBasin();

    memset(alreadyVisited, 0, sizeof(alreadyVisited));

    /**
        It simply counts basins which are together.
    **/
    for(int i = 0; i < M; i++) {
        for(int j = 0; j < N; j++) {
            if( alreadyVisited[i][j] == false && maximumBasin[i][j] == BASIN) {
                DFS_FOR_COUNTING_BASINS_TOGETHER(cntTemp, i, j);
                //cout << cntTemp << endl;
                if(cntTemp > maximumCounts ) maximumCounts = cntTemp;
                cntTemp = 0;
            }
        }
    }

    /**
        This is our final Answer.
    **/
    cout << maximumCounts << endl;
    return 0;
}
like image 89
devsda Avatar answered Oct 27 '22 10:10

devsda


Construct the elevation chart as a graph where each of the elements of your 2d array is a node. Also, there's a directed edge from node u to node v, if the elevation of node u >= elevation of node v. Construct the SCC of this graph and pick the largest component. That will be the basin you are looking for.

like image 29
Pradhan Avatar answered Oct 27 '22 09:10

Pradhan