Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find bounding box contour with largest surface area excluding intersection areas

I have an array of bounding boxes from the object detection system. They are in the format:

[[x,y], [x,y], [x,y], [x,y]]

I want to find the largest bounding box that is not intersecting with any other provided boxes nor is inside an excluded box.

I am using python, but response in any programming language is welcomed :)

Visual example

enter image description here

How I tried and failed to solve this problem.

Approach I.

Iterate over every point and find the min and max of x and y.

Then crop to a polygon using these coordinates.

The problem is that algorithm on an example image would remove the top part of the image but there is no need to because we 'missed' top left and right boxes.

Approach II.

Try to choose to crop only one side at a time, because usually in my dataset things to exclude are on one side. e.g. remove top 100px

So I calculated the min and max of x and y like before. Then the calculated area of every possible cut - left, right, top, bottom and choose one with the smallest area.

This approach failed pretty quickly when there are boxes on two sides of picture like left and right

like image 721
Damian Grzanka Avatar asked Sep 11 '21 19:09

Damian Grzanka


3 Answers

Consider a full recangle (initially the whole picture) and take away one excluded box. You will get 2x2x2x2=16 possible rectangular subdivisions, for example this one.

  ┌────────────────────────┐
  │                        │
  │                        │
  ├───────┬───────┬────────┤
  │       │  exc  │        │
  │       │ lude  │        │
  │       ├───────┴────────┤
  │       │                │
  │       │                │
  └───────┴────────────────┘

For each box in the subdivision, take away the next excluded box. Do this N times, and take the biggest box of the final step.

like image 56
Wolfgang Kuehn Avatar answered Oct 18 '22 20:10

Wolfgang Kuehn


Here's a potential solution to find the bounding box contour with the largest surface area. We have two requirements:

  1. Largest bounding box is not intersecting with any other box
  2. Largest bounding box is not inside another box

Essentially we can reword the two requirements to this:

  1. Given C1 and C2, determine if C1 and C2 intersect
  2. Given C1 and C2, check if there is a point from C1 in C2

To solve #1, we can create a contour_intersect function that uses a bitwise AND operation with np.logical_and() to detect intersection. The idea is to create two separate masks for each contour and then use the logical AND operation on them. Any points that have a positive value (1 or True) will be points of intersection. Essentially, if the entire array is False then there was no intersection between the contours. But if there is a single True, then the contours touched at some point and thus intersect.

For #2, we can create a function contour_inside and use cv2.pointPolygonTest() to determine if a point is inside, outside, or on the edge of a contour. The function returns +1, -1, or 0 to indicate if a point is inside, outside, or on the contour, respectively. We find the centroid of C1 and then check if that point is inside C2.


Here's an example to visualize the scenarios:

Input image with three contours. Nothing special here, the expected answer would be the contour with the largest area.

enter image description here

Answer:

Contour #0 is the largest

Next we add two additional contours. Contour #3 will represent the intersection scenario and contour #4 will represent the inside contour scenario.

enter image description here

Answer:

Contour #0 has failed test
Contour #1 has failed test
Contour #2 is the largest

To solve this problem, we find contours then sort using contour area from largest to smallest. Next, we compare this contour with all other contours and check the two cases. If either case fails, we dump the current contour and move onto the next largest contour. The first contour that passes both tests for all other contours is our largest bounding box contour. Normally, contour #0 would be our largest but it fails the intersection test. We then move onto contour #1 but this fails the inside test. Thus the last remaining contour that passes both tests is contour #2.

import cv2
import numpy as np

# Check if C1 and C2 intersect
def contour_intersect(original_image, contour1, contour2):
    # Two separate contours trying to check intersection on
    contours = [contour1, contour2]

    # Create image filled with zeros the same size of original image
    blank = np.zeros(original_image.shape[0:2])

    # Copy each contour into its own image and fill it with '1'
    image1 = cv2.drawContours(blank.copy(), contours, 0, 1)
    image2 = cv2.drawContours(blank.copy(), contours, 1, 1)

    # Use the logical AND operation on the two images
    # Since the two images had bitwise and applied to it,
    # there should be a '1' or 'True' where there was intersection
    # and a '0' or 'False' where it didnt intersect
    intersection = np.logical_and(image1, image2)

    # Check if there was a '1' in the intersection
    return intersection.any()

# Check if C1 is in C2
def contour_inside(contour1, contour2):
    # Find centroid of C1
    M = cv2.moments(contour1)
    cx = int(M['m10']/M['m00'])
    cy = int(M['m01']/M['m00'])

    inside = cv2.pointPolygonTest(contour2, (cx, cy), False)

    if inside == 0 or inside == -1:
        return False
    elif inside == 1:
        return True

# Load image, convert to grayscale, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours, sort by contour area from largest to smallest
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
sorted_cnts = sorted(cnts, key=lambda x: cv2.contourArea(x), reverse=True)

# "Intersection" and "inside" contours
# Add both contours to test 
# --------------------------------
intersect_contour = np.array([[[230, 93]], [[230, 187]], [[326, 187]], [[326, 93]]])
sorted_cnts.append(intersect_contour)
cv2.drawContours(original, [intersect_contour], -1, (36,255,12), 3)

inside_contour = np.array([[[380, 32]], [[380, 229]], [[740, 229]], [[740, 32]]])
sorted_cnts.append(inside_contour)
cv2.drawContours(original, [inside_contour], -1, (36,255,12), 3)
# --------------------------------

# Find centroid for each contour and label contour number
for count, c in enumerate(sorted_cnts):
    M = cv2.moments(c)
    cx = int(M['m10']/M['m00'])
    cy = int(M['m01']/M['m00'])
    cv2.putText(original, str(count), (cx-5, cy+5), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (246,255,12), 3)

# Find largest bounding box contour
largest_contour_name = ""
largest_contour = ""
contours_length = len(sorted_cnts)
for i1 in range(contours_length):
    found = True
    for i2 in range(i1 + 1, contours_length):
        c1 = sorted_cnts[i1]
        c2 = sorted_cnts[i2]
        
        # Test intersection and "inside" contour
        if contour_intersect(original, c1, c2) or contour_inside(c1, c2):
            print('Contour #{} has failed test'.format(i1))
            found = False
            continue
    if found:
        largest_contour_name = i1
        largest_contour = sorted_cnts[i1]
        break
print('Contour #{} is the largest'.format(largest_contour_name))
print(largest_contour)

# Display
cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.imshow('original', original)
cv2.waitKey()

Note: The assumption is that you have an array of contours from cv2.findContours() with the format like this example:

cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
sorted_cnts = sorted(cnts, key=lambda x: cv2.contourArea(x), reverse=True)
for c in sorted_cnts:
    print(c)
    print(type(c))
    x,y,w,h = cv2.boundingRect(c)
    print((x,y,w,h))

Output

[[[230  93]]

 [[230 187]]

 [[326 187]]

 [[326  93]]]
<class 'numpy.ndarray'>
(230, 93, 97, 95)

Performance note: The intersection check function suffers on the performance side since it creates three copies of the input image to draw the contours and may be slower when it comes to execution time with a greater number of contours or a larger input image size. I'll leave this optimization step to you!

like image 28
nathancy Avatar answered Oct 18 '22 19:10

nathancy


You can use the cv2.boundingRect() method to get the x, y, w, h of each bounding box, and with the x, y, w, h of each bounding box, you can use the condition x2 + w2 > x1 > x2 - w1 and y2 + h2 > y1 > y2 - h1 to check if any two bounding boxes intersect or are within each others:

import cv2
import numpy as np

def intersect(b1, b2):
    x1, y1, w1, h1 = b1
    x2, y2, w2, h2 = b2
    return x2 + w2 > x1 > x2 - w1 and y2 + h2 > y1 > y2 - h1

# Here I am generating a random array of 10 boxes in the format [[x,y], [x,y], [x,y], [x,y]]
np.random.seed(55)
boxes = np.random.randint(10, 150, (10, 4, 2)) + np.random.randint(0, 300, (10, 1, 2))

bounds = [cv2.boundingRect(box) for box in boxes]
valids = [b1 for b1 in bounds if not any(intersect(b1, b2) for b2 in bounds if b1 != b2)]
if valids:
    x, y, w, h = max(valids, key=lambda b: b[2] * b[3])
    print(f"x: {x} y: {y} w: {w} h: {h}")
else:
    print("All boxes intersect.")

Output:

x: 75 y: 251 w: 62 h: 115

For visualization:

import cv2
import numpy as np

def intersect(b1, b2):
    x1, y1, w1, h1 = b1
    x2, y2, w2, h2 = b2
    return x2 + w2 > x1 > x2 - w1 and y2 + h2 > y1 > y2 - h1

np.random.seed(55)
boxes = np.random.randint(10, 150, (10, 4, 2)) + np.random.randint(0, 300, (10, 1, 2))

bounds = [cv2.boundingRect(box) for box in boxes]
valids = [b1 for b1 in bounds if not any(intersect(b1, b2) for b2 in bounds if b1 != b2)]

img = np.zeros((500, 500), "uint8")
for x, y, w, h in bounds:
    cv2.rectangle(img, (x, y), (x + w, y + h), 255, 1)

if valids:
    x, y, w, h = max(valids, key=lambda b: b[2] * b[3])
    cv2.rectangle(img, (x, y), (x + w, y + h), 128, -1)

cv2.imshow("IMAGE", img)
cv2.waitKey(0)

Output:

enter image description here

like image 1
Ann Zen Avatar answered Oct 18 '22 21:10

Ann Zen