I would like to know the different algorithms to find the biggest square in a two dimensions map dotted with obstacles.
An example, where o
would be obstacles:
...........................
....o......................
............o..............
...........................
....o......................
...............o...........
...........................
......o..............o.....
..o.......o................
The biggest square would be (if we choose the first one):
.....xxxxxxx...............
....oxxxxxxx...............
.....xxxxxxxo..............
.....xxxxxxx...............
....oxxxxxxx...............
.....xxxxxxx...o...........
.....xxxxxxx...............
......o..............o.....
..o.......o................
What would be the fastest algorithm to find it? The one with the smallest complexity?
EDIT: I know that people are interested on the algorithm explained in the accepted answer, so I made a document that explains it a bit more, you can find it here:
https://docs.google.com/document/d/19pHCD433tYsvAor0WObxa2qusAjKdx96kaf3z5I8XT8/edit?usp=sharing
The maximum size of a square bounded by a rectangle will be the square of the shortest side of that rectangle. All you need to figure out is the height and width.
Here is how to do this in the optimal amount of time, O(nm). This is built on top of @dukeling's insight that you never need to check a solution of size less than your current known best solution.
The key is to be able to build a data structure that can answer this query in O(1) time.
To solve that problem, we'll support answering a slightly harder question, also in O(1).
It's easy to answer the square existence question with an answer from the rectangle count question.
To answer the rectangle count question, note that if you had pre-computed the answer for every rectangle that starts in the top left, then you could answer the general question for from r1, c1 to r2, c2 by a kind of clever/inclusion exclusion tactic using only rectangles that start in the top left
c1 c2
-----------------------
| | | |
| A | B | |
|_____________|____| | r1
| | | |
| C | D | |
|_____________|____| | r2
|_____________________|
We want the count of stuff inside D. In terms of our pre-computed counts from the top left.
Count(D) = Count(A ∪ B ∪ C ∪ D) - Count(A ∪ C) - Count(A ∪ B) + Count(A)
You can pre-compute all the top left rectangles in O(nm) by doing some clever row/column partial sums, but I'll leave that to you.
Then to answer the to the problem you want just involves checking possible solutions, starting with solutions that are at least as good as your known best. Your known best will only get better up to min(n, m) times total, so the best_possible increment will happen very rarely and almost all squares will be rejected in O(1) time.
best_possible = 0
for r in range(n):
for c in range(m):
while True:
# this looks O(min(n, m)), but it's amortized O(1) since best_possible
# rarely increased.
if possible(r, c, best_possible+1):
best_possible += 1
else:
break
One idea, making use of binary search.
The basic idea:
Start off in the top-left corner. See if a 1x1 square would work.
The native approach:
We can simply check every possible cell of every square at each step, but this is fairly inefficient.
The optimized approach:
When increasing the square size, we can just do a binary search over the next row and column to see if that row / column contains an obstacle at any of those positions.
When moving to the right, we can do a binary search for each next column to determine if that column contains an obstacle at any of those positions.
When moving down, we can do a similar binary on each of the columns in the target position.
Implementation note:
To start off, we'd need to go through all the rows and columns and set up arrays containing the positions of the obstacles for each of them, which we can use for the binary searches.
Running time:
We do 2 binary searches to increase the square size, and the square size is maximum the size of the grid, so that is fairly small (O(min(m,n) log max(m,n))
) and gets dominated by the below.
Beyond that, for each position, we do a single binary search on a column.
So, for a grid with m
columns and n
rows, the overall complexity is O(mn log m)
.
But note how little we're actually searching below when the grid is sparse.
Example:
For your example:
012345678901234567890123456
0...........................
1....o......................
2............o..............
3...........................
4....o......................
5...............o...........
6...........................
7......o..............o.....
8..o.......o................
We'd first try a 1x1 square in the top-left corner, which works.
Then a 2x2 square. For this, we do a binary search for the range [0,1]
on the row 1, which can be represented simply by {4}
- an array of a single position corresponding to where the obstacle is. And we also do a binary search for the range [0,1]
on the column 1, which contains no obstacles, thus an empty array - {}
.
Then a 3x3 square. For this, we do a binary search for [0,2]
on the row 2, which contains 1 obstacles at position 12, thus {12}
. And we also do a binary search for [0,2]
on the column 2, which contains an obstacle at position 8, thus {8}
.
Then a 4x4 square. For this, we do a binary search for [0,3]
on the row 3 - {}
. And for [0,3]
on column 3 - {}
.
Then a 5x5 square. For this, we do a binary search for [0,4]
on the row 4 - {4}
. And for [0,4]
column 4 - {1,4}
.
Here is the first one we actually find. In the range [0,4]
, we find 4
in both the row and the column (we only really need to find one of them). So this indicates a fail.
From here we do a binary search on column 4 (again - not really necessary) for [0,4]
. Then binary search columns 5-8 for [0,4]
, none of them found, so a square starting at position 5,0
is the next possible candidate.
So from here we try to increase the square size to 5x5, which works, then 6x6 and 7x7, which works.
Then we try 8x8, which doesn't work.
And so on.
I know binary search, but how does yours work?
So we're basically doing a range search within a set of values. This is fairly easy to do. First search for the starting value of the range, then the end value. If we get to the same point, there are no values in the range.
We don't really care what values exist in the range, just whether or not there are any.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With