How to detect rectangle in a rectangle?

Tags:

I am having the co-ordinates of top-left corner and the bottom-right corner of a list of rectangles say (a,b) and (c,d). I want to detect and remove rectangles which are within a rectangle. Overlapping rectangles can stay.

I have a dataset of 10,000 rectangles and I want an efficient way to solve this problem.

Currently I am doing it this way,

import pandas

data = pd.read_csv('dataset.csv')

l = list(range(len(data)-1))

for i in range(len(data)):
    length = len(l)
    if i >= length:
        break
    for j in range(i+1, length):
        if j >= len(l):
           break
        if (data.iloc[l[i]]['a'] >= data.iloc[l[j]]['a']) and (data.iloc[l[i]]['b'] <= data.iloc[l[j]]['b']) and (data.iloc[l[i]]['c'] <= data.iloc[l[j]]['c']) and (data.iloc[l[i]]['d'] >= data.iloc[l[j]]['d']):
           l.pop(j)

I have implemented this algorithm after sorting the dataset in descending order of the area of the rectangles as rectangles with bigger areas do not fit inside rectangles with smaller areas. Here, I am poping the index of the rectangle from the list l after detecting whether it is inside another rectangle. Each time an element is poped it reduces the iterations.

This is taking few hours to solve and I need an efficient way to solve it even for hundred thousand samples.

Please help!

237

asked Mar 29 '18 23:03

Sri Sanketh Uppalapati

1 Answers

Here is a little divide-and-conquer algorithm that you could try.

I assume that as soon as you can quickly enumerate every pair of colliding rectangles, you can also check whether one is fully contained in the other in constant time.

So, we have only to find colliding rectangles.

First, generalize it as follows: assume that you have two sets of rectangles A and B, and that you want to find only pairs (a, b) such that rectangle a is from A, b is from B, and a and b intersect.

First, the idea. Consider the following example of two groups A and B of rectangles partially separated by a horizontal line L:

      +----+                    +-----+
      | A1 |                    |  B1 |
      |    |     +-----+        +-----+
      +----+     |  A2 |
                 +-----+     +-----+
                             |  A3 |
_____________________________|_____|________ L
                             |     |
         +-------------------+##+  |
         |                   |##|  |
         |     B2            +##|--+    
         |                      |
         +----------------------+

The line L subdivides the sets A and B into three subsets:

A above L: {A1, A2}         (short: A>L)
A intersects L: {A3}        (short: A=L)
A below L: {}               (short: A<L)


B above L: {B1}             (B>L)
B intersects L: {}          (B=L)
B below L: {B2}             (B<L)

Observe that only rectangles from the following groups can collide:

         A<L    A=L    A>L
B<L       y      y      N
B=L       y      y      y
B>L       N      y      y

That is, if we want to find all collisions between A and B, once we have found a suitable line L, we can ignore the collisions between A<L vs. B>L and A>L vs. B<L. Thus, we obtain the following divide-and-conquer algorithm: while A and B not empty, find a suitable line that (roughly) maximizes the number of eliminated collision checks, subdivide A and B into three groups each, recursively proceed with seven subgroup collisions, ignore two subgroup combinations.

Assuming that if the rectangles are "small", and the groups A=L and B=L are mostly empty, this will (roughly) cut the size of the sets in half in every step, and we thus obtain an algorithm that on average runs in something like O(n*log(n)) instead of O(n*n).

Once you have the general case for arbitrary A and B, take the entire set of rectangles R and run the algorithm with A = R; B = R.

Here is a rough sketch in Python:

def isSubinterval(aStart, aEnd, bStart, bEnd):
  return aStart >= bStart and aEnd <= bEnd

def intersects(aStart, aEnd, bStart, bEnd):
  return not (aEnd < bStart or aStart > bEnd)

class Rectangle:
  def __init__(self, l, r, b, t):
    self.left = l
    self.right = r
    self.bottom = b
    self.top = t

  def isSubrectangle(self, other):
    return (
      isSubinterval(self.left, self.right, other.left, other.right) and
      isSubinterval(self.bottom, self.top, other.bottom, other.top)
    )

  def intersects(self, other):
    return (
      intersects(self.left, self.right, other.left, other.right) and
      intersects(self.bottom, self.top, other.bottom, other.top)
    )

  def __repr__(self):
    return ("[%f,%f]x[%f,%f]" % (self.left, self.right, self.bottom, self.top))

def boundingBox(rects):
  infty = float('inf')
  b = infty
  t = - infty
  l = infty
  r = - infty
  for rect in rects:
    b = min(b, rect.bottom)
    l = min(l, rect.left)
    r = max(r, rect.right)
    t = max(t, rect.top)
  return Rectangle(l, r, b, t)

class DividingLine:
  def __init__(self, isHorizontal, position):
    self.isHorizontal = isHorizontal
    self.position = position

  def isAbove(self, rectangle):
    if self.isHorizontal:
      return rectangle.bottom > self.position
    else:
      return rectangle.left > self.position

  def isBelow(self, rectangle):
    if self.isHorizontal:
      return rectangle.top < self.position
    else:
      return rectangle.right < self.position

def enumeratePossibleLines(boundingBox):
  NUM_TRIED_LINES = 5
  for i in range(1, NUM_TRIED_LINES + 1):
    w = boundingBox.right - boundingBox.left
    yield DividingLine(False, boundingBox.left + w / float(NUM_TRIED_LINES + 1) * i)
    h = boundingBox.top - boundingBox.bottom
    yield DividingLine(True, boundingBox.bottom + h / float(NUM_TRIED_LINES + 1) * i)

def findGoodDividingLine(rects_1, rects_2):
  bb = boundingBox(rects_1 + rects_2)
  bestLine = None
  bestGain = 0
  for line in enumeratePossibleLines(bb):
    above_1 = len([r for r in rects_1 if line.isAbove(r)])
    below_1 = len([r for r in rects_1 if line.isBelow(r)])
    above_2 = len([r for r in rects_2 if line.isAbove(r)])
    below_2 = len([r for r in rects_2 if line.isBelow(r)])

    # These groups are separated by the line, no need to 
    # perform all-vs-all collision checks on those groups!
    gain = above_1 * below_2 + above_2 * below_1
    if gain > bestGain:
      bestGain = gain
      bestLine = line
  return bestLine

# Collides all rectangles from list `rects_1` with 
# all rectangles from list `rects_2`, and invokes
# `onCollision(a, b)` on every colliding `a` and `b`.
def collideAllVsAll(rects_1, rects_2, onCollision):
  if rects_1 and rects_2: # if one list empty, no collisions
    line = findGoodDividingLine(rects_1, rects_2)
    if line:
      above_1 = [r for r in rects_1 if line.isAbove(r)]
      below_1 = [r for r in rects_1 if line.isBelow(r)]
      above_2 = [r for r in rects_2 if line.isAbove(r)]
      below_2 = [r for r in rects_2 if line.isBelow(r)]
      intersect_1 = [r for r in rects_1 if not (line.isAbove(r) or line.isBelow(r))]
      intersect_2 = [r for r in rects_2 if not (line.isAbove(r) or line.isBelow(r))]
      collideAllVsAll(above_1, above_2, onCollision)
      collideAllVsAll(above_1, intersect_2, onCollision)
      collideAllVsAll(intersect_1, above_2, onCollision)
      collideAllVsAll(intersect_1, intersect_2, onCollision)
      collideAllVsAll(intersect_1, below_2, onCollision)
      collideAllVsAll(below_1, intersect_2, onCollision)
      collideAllVsAll(below_1, below_2, onCollision)
    else:
      for r1 in rects_1:
        for r2 in rects_2:
          if r1.intersects(r2):
            onCollision(r1, r2)

Here is a little demo:

rects = [
  Rectangle(1,6,9,10),
  Rectangle(4,7,6,10),
  Rectangle(1,5,6,7),
  Rectangle(8,9,8,10),
  Rectangle(6,9,5,7),
  Rectangle(8,9,1,6),
  Rectangle(7,9,2,4),
  Rectangle(2,8,2,3),
  Rectangle(1,3,1,4)
]

def showInterestingCollision(a, b):
  if a is not b:
    if a.left < b.left:
      print("%r <-> %r collision" % (a, b))

collideAllVsAll(rects, rects, showInterestingCollision)

At least in this case, it indeed detects all the interesting collisions:

[1.000000,6.000000]x[9.000000,10.000000] <-> [4.000000,7.000000]x[6.000000,10.000000] collision
[1.000000,5.000000]x[6.000000,7.000000] <-> [4.000000,7.000000]x[6.000000,10.000000] collision
[4.000000,7.000000]x[6.000000,10.000000] <-> [6.000000,9.000000]x[5.000000,7.000000] collision
[6.000000,9.000000]x[5.000000,7.000000] <-> [8.000000,9.000000]x[1.000000,6.000000] collision
[7.000000,9.000000]x[2.000000,4.000000] <-> [8.000000,9.000000]x[1.000000,6.000000] collision
[2.000000,8.000000]x[2.000000,3.000000] <-> [8.000000,9.000000]x[1.000000,6.000000] collision
[2.000000,8.000000]x[2.000000,3.000000] <-> [7.000000,9.000000]x[2.000000,4.000000] collision
[1.000000,3.000000]x[1.000000,4.000000] <-> [2.000000,8.000000]x[2.000000,3.000000] collision

Here is a somewhat more realistic demo:

from random import random
from matplotlib import pyplot as plt

def randomRect():
  w = random() * 0.1
  h = random() * 0.1
  centerX = random() * (1 - w)
  centerY = random() * (1 - h)
  return Rectangle(
    centerX - w/2, centerX + w/2,
    centerY - h/2, centerY + h/2
  )

randomRects = [randomRect() for _ in range(0, 500)]

for r in randomRects:
  plt.fill(
    [r.left, r.right, r.right, r.left], 
    [r.bottom, r.bottom, r.top, r.top],
    'b-',
    color = 'k',
    fill = False
  )

def markSubrectanglesRed(a, b):
  if a is not b:
    if a.isSubrectangle(b):
      plt.fill(
        [a.left, a.right, a.right, a.left], 
        [a.bottom, a.bottom, a.top, a.top],
        'b-',
        color = 'r',
        alpha = 0.4
      )
      plt.fill(
        [b.left, b.right, b.right, b.left], 
        [b.bottom, b.bottom, b.top, b.top],
        'b-',
        color = 'b',
        fill = False
      )

collideAllVsAll(randomRects, randomRects, markSubrectanglesRed)

plt.show()

The plot shows all eliminated rectangles in red, and the enclosing rectangles in blue:

enter image description here

Here is a visualization of the bounding boxes (yellow) and chosen dividing lines (cyan) of the quasi-binary space partitioning for a small example with a single collision:

enter image description here

For 10000 "reasonably sized" random rectangles (with rate of intersection roughly as in the image), it computes all collisions in 18 seconds, even though the code is very far from being optimized.

170

answered Sep 19 '22 00:09

Andrey Tyukin

Related questions
                            
                                Odd issue with .isin() and strings (Python/Pandas)
                            
                                Destructing assignment order in Python
                            
                                New to Python, don't know what is wrong with my code
                            
                                How to extract values from a Pandas DataFrame, rather than a Series (without referencing the index)?
                            
                                tensorflow code TypeError: unsupported operand type(s) for *: 'int' and 'Flag'
                            
                                How to override ModelChoiceField / ModelMultipleChoiceField default widget with a template for each choice
                            
                                Unable to read table from website using Beautifulsoup
                            
                                Django TypeError: __init__() takes 1 positional argument but 2 were given
                            
                                Subtract two dataframe with the same name different index
                            
                                How to efficiently parallelize time series forecasting using dask?
                            
                                How to get a list of TestReports at the end of a py.test run?
                            
                                Python script to use data from Azure Storage Blob by stream, and update blob by stream without local file reading and uploading
                            
                                Django email not working - smtplib.SMTPServerDisconnected: Connection unexpectedly closed
                            
                                How to implement FIPS_mode() and FIPS_mode_set() in Python 3.6's ssl module?
                            
                                How could I sort the coordinates according to the serpentine in the image?
                            
                                Boto3 not uploading zip file to S3 python
                            
                                Python - Google OAuth2 - Wrong number of segments in token
                            
                                How to remove duplicates only if consecutive in a string? [duplicate]
                            
                                How to pass variables in spark SQL, using python?
                            
                                Python - How do you run a .py file?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to detect rectangle in a rectangle?

Tags:

performance

python

algorithm

rectangles

computational-geometry

Sri Sanketh Uppalapati

People also ask

1 Answers

Andrey Tyukin

Recent Activity

Donate For Us