How is the IoU calculated for multiple bounding box predictions in TensorFlow Object Detection API?

Question

How is the IoU metric calculated for multiple bounding box predictions in the TensorFlow Object Detection API?

Austin Ulfers · Accepted Answer

I'm not sure exactly how TensorFlow does it, but here is one way I recently got it to work, since I couldn't find a good solution online. I used NumPy matrices to get the IoU and other pixel-level metrics (TP, FP, TN, FN) for multi-object detection.

Let's say for this example that your image is 6x6.

import cv2
import numpy as np

empty = np.zeros((6, 6))  # 6x6 single-channel image filled with zeros

array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])

And you have the ground truth for 2 objects, one in the bottom left of the image and one smaller one in the top right.

bbox_actual_obj1 = [[0, 3], [2, 5]]  # [x, y] of top-left corner, [x, y] of bottom-right corner
bbox_actual_obj2 = [[4, 0], [5, 1]]

Using OpenCV, you can add these objects to a copy of the empty image array.

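actual = empty.copy()
# Fill each ground-truth box with the value 1.
# thickness=-1 draws a filled rectangle; both corners are inclusive.
# tuple() keeps the points valid for OpenCV versions that reject lists.
actual = cv2.rectangle(
    actual,
    tuple(bbox_actual_obj1[0]),
    tuple(bbox_actual_obj1[1]),
    1,
    -1
)
actual = cv2.rectangle(
    actual,
    tuple(bbox_actual_obj2[0]),
    tuple(bbox_actual_obj2[1]),
    1,
    -1
)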

array([[0., 0., 0., 0., 1., 1.],
       [0., 0., 0., 0., 1., 1.],
       [0., 0., 0., 0., 0., 0.],
       [1., 1., 1., 0., 0., 0.],
       [1., 1., 1., 0., 0., 0.],
       [1., 1., 1., 0., 0., 0.]])
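If you would rather not depend on OpenCV for this step, the same mask can be built with NumPy slice assignment. This is only a minimal sketch: `boxes_to_mask` is a hypothetical helper name (not part of OpenCV or TensorFlow), and it assumes boxes are given as [[x1, y1], [x2, y2]] with both corners inclusive, as in the example above.

```python
import numpy as np


def boxes_to_mask(boxes, shape, value=1):
    """Rasterize inclusive [[x1, y1], [x2, y2]] boxes into a 2-D array.

    Hypothetical helper for illustration only.
    """
    mask = np.zeros(shape)
    for (x1, y1), (x2, y2) in boxes:
        # Rows index y and columns index x. The +1 makes the bottom-right
        # corner inclusive, matching the filled rectangles drawn above.
        mask[y1:y2 + 1, x1:x2 + 1] = value
    return mask


actual_boxes = [[[0, 3], [2, 5]], [[4, 0], [5, 1]]]
actual_mask = boxes_to_mask(actual_boxes, (6, 6), value=1)
print(actual_mask)
```

Passing value=2 with the predicted boxes would give the prediction mask in the same way.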

Now let's say that below are our predicted bounding boxes:

bbox_pred_obj1 = [[1, 3], [3, 5]]  # [x, y] of top-left corner, [x, y] of bottom-right corner
bbox_pred_obj2 = [[3, 0], [5, 2]]

Now we do the same thing as above but change the value we assign within the array.

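pred = empty.copy()
# Fill each predicted box with the value 2 so it can be told apart
# from the ground truth (value 1) once the two arrays are added.
pred = cv2.rectangle(
    pred,
    tuple(bbox_pred_obj1[0]),
    tuple(bbox_pred_obj1[1]),
    2,
    -1
)
pred = cv2.rectangle(
    pred,
    tuple(bbox_pred_obj2[0]),
    tuple(bbox_pred_obj2[1]),
    2,
    -1
)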

array([[0., 0., 0., 2., 2., 2.],
       [0., 0., 0., 2., 2., 2.],
       [0., 0., 0., 2., 2., 2.],
       [0., 2., 2., 2., 0., 0.],
       [0., 2., 2., 2., 0., 0.],
       [0., 2., 2., 2., 0., 0.]])

If we convert these arrays to matrices and add them, we get the following result (plain NumPy arrays add element-wise in the same way, so the np.matrix conversion is optional):

actual_matrix = np.matrix(actual)
pred_matrix = np.matrix(pred)
combined = actual_matrix + pred_matrix

matrix([[0., 0., 0., 2., 3., 3.],
        [0., 0., 0., 2., 3., 3.],
        [0., 0., 0., 2., 2., 2.],
        [1., 3., 3., 2., 0., 0.],
        [1., 3., 3., 2., 0., 0.],
        [1., 3., 3., 2., 0., 0.]])

Now all we need to do is count how many times each value occurs in the combined matrix to get the TP, FP, TN, and FN pixel counts.

combined = np.squeeze(
    np.asarray(
        pred_matrix + actual_matrix
    )
)
unique, counts = np.unique(combined, return_counts=True)
zipped = dict(zip(unique, counts))

{0.0: 15, 1.0: 3, 2.0: 8, 3.0: 10}
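One caveat with np.unique is that it only returns values that actually occur, so a lookup such as zipped[3.0] raises a KeyError when the boxes do not overlap at all. A possible alternative, sketched here with integer arrays and the masks rebuilt by slicing so the snippet runs on its own, is np.bincount with minlength=4, which always returns four counts in the order 0, 1, 2, 3.

```python
import numpy as np

# Rebuild the example masks as integer arrays (ground truth = 1, prediction = 2).
actual = np.zeros((6, 6), dtype=int)
actual[3:6, 0:3] = 1   # ground-truth object 1
actual[0:2, 4:6] = 1   # ground-truth object 2

pred = np.zeros((6, 6), dtype=int)
pred[3:6, 1:4] = 2     # predicted object 1
pred[0:3, 3:6] = 2     # predicted object 2

# Each cell becomes 0 (TN), 1 (FN), 2 (FP) or 3 (TP).
codes = actual + pred

# minlength=4 guarantees an entry for every code, even one that never occurs.
counts = np.bincount(codes.ravel(), minlength=4)
tn, fn, fp, tp = (int(c) for c in counts)
print(tn, fn, fp, tp)  # 15 3 8 10
```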

Legend:

True Negative: 0
False Negative: 1
False Positive: 2
True Positive/Intersection: 3
Union: 1 + 2 + 3

IoU: 10/(3 + 8 + 10) = 0.48
Precision: 10/(10 + 8) = 0.56
Recall: 10/(10 + 3) = 0.77
F1: 10/(10 + 0.5 * (3 + 8)) = 0.65
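The arithmetic above can be wrapped in a small function. This is a sketch rather than anything taken from the TensorFlow Object Detection API: `pixel_metrics` is a hypothetical name, and returning 0.0 when a denominator is zero is an assumed convention that you may want to change.

```python
def pixel_metrics(tp, fp, fn):
    """Compute pixel-level IoU, precision, recall and F1 from raw counts.

    Hypothetical helper for illustration only.
    """
    def safe_div(num, den):
        # Assumed convention: report 0.0 instead of dividing by zero.
        return num / den if den else 0.0

    return {
        "iou": safe_div(tp, tp + fp + fn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        # Same as tp / (tp + 0.5 * (fp + fn)).
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


metrics = pixel_metrics(tp=10, fp=8, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With the counts from this example it prints 0.48, 0.56, 0.77 and 0.65, matching the values worked out by hand. Note that these are pooled over all boxes in the image, so they measure overlap of the combined masks rather than a separate IoU per object.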

Tags: tensorflow, object-detection

Nagarjun Gururaj
