There are diverse rectangles in given 1000 x 1000 arrays. in <Figure 1>
, the serial “1” displayed as yellow cell is the pattern of rectangle. The minimum size of rectangle in <Figure 1>
is 3 x 3 displayed as green cell.
There should be at least one of ‘0’ inside the rectangle.
But, in this array, there exists unclosed shape or straight line pattern also.
(The initial value of array is ‘0’, and the patterns are represented a series of ‘1’. They are not overlapped or included each other.)
What could be an efficient algorithm to find the complete regtangles in the array except unclosed shape or straight line? For example in the figure above the number of complete rectangles is 3
This is quite simple. If you have n
squares, you can count the rectangles in O(n)
.
Assumptions:
You would need extra memory as big as the input. Let's call this visited
and initialize with 0.
Let's first construct a helper function:
is_rectangle(square s)
from s, go right while you see 1s and visited is 0
mark them as 1 in visited
if less than 3 steps
return 0
then, go down while you see 1s and visited is 0
mark them as 1 in visited
if less than 3 steps
return 0
then, go left while you see 1s and visited is 0
mark them as 1 in visited
if less than 3 steps
return 0
then, go up while you see 1s and visited is 0
mark them as 1 in visited
if then one above is not s
return 0
return 1
This function basically traces the 1s in the directions of right-down-left-up and checks if the conditions are met (length at least 3 and reaching the starting position). It also marks the visited squares.
The important thing to notice about this function, is that it works correctly only if the initial square is the top-left corner.
Now, the solution to the problem is easy:
num_rectangles = 0
initialize visited to 0 (rows x columns)
for r in rows
for c in columns
if visitied[r][c] or image[r][c] == 0
continue
num_rectangles += is_rectangle(<r,c>)
return num_rectangles
Here is how the algorithm executes:
1. Failed (and marked) part of a bad rectangle
2. Found (and marked) a rectangle.
3. Failed on a single square (of a vertical line)
4. Failed on a single square (of a vertical line)
5. Failed on a single square (of a vertical line)
6. After many similar steps, found the next rectangle.
7. And the next rectangle
8. Algorithm end
The following O(n) algorithm will work on any 2D matrix of 0/1 values (that is, intersecting/overlapping rectangles are allowed, as are arbitrary non-rectangular open/closed shapes). The definition of a rectangle I'm using here is "the interior is entirely made up of 0-cells" (so e.g. if one rectangle entirely contains another, only the inner rectangle will be found; if containing rectangles should also be considered, then each found rectangle can be deleted and the algorithm restarted). It's based on the observation that each 0-cell can be in the interior of at most one 1-rectangle.
I use the convention that x = 0 is the leftmost position and y = 0 is the topmost position.
If you can have only rectangular shapes in your array, it is equivalent to a classic problem of computation on binary images: just apply a standard algorithm for connected components. You label only connected components of 0's, and count them.
See http://en.wikipedia.org/wiki/Connected-component_labeling for example. This type of algorithm is quite simple on images but take some amount of memory (same size as your input array, of type short or int). Be careful of connectivity: if you choose 4-connectivity, you will count enclosed rectangles even if some corners are missing. But the algorithm is simpler than with 8-connectivity.
If you can have more complex enclosed shapes, just add a post-processing: for each connected component, count the number of pixels inside the bounding box of the component (if the two numbers are equal, you have a rectangle)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With