 

Get the bounding box coordinates in the TensorFlow object detection API tutorial

I am new to both Python and TensorFlow. I am trying to run the object detection tutorial file from the TensorFlow Object Detection API, but I cannot find where to get the coordinates of the bounding boxes when objects are detected.

Relevant code:

    # The following processing is only for single image
    detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
    detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])

The place where I assume bounding boxes are drawn is like this:

    # Visualization of the results of detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image_np)

I tried printing output_dict['detection_boxes'] but I am not sure what the numbers mean. There are a lot.

    array([[ 0.56213236,  0.2780568 ,  0.91445708,  0.69120586],
           [ 0.56261235,  0.86368728,  0.59286624,  0.8893863 ],
           [ 0.57073039,  0.87096912,  0.61292225,  0.90354401],
           [ 0.51422435,  0.78449738,  0.53994244,  0.79437423],
           ......
           [ 0.32784131,  0.5461576 ,  0.36972913,  0.56903434],
           [ 0.03005961,  0.02714229,  0.47211722,  0.44683522],
           [ 0.43143299,  0.09211366,  0.58121657,  0.3509962 ]], dtype=float32)

I found answers for similar questions, but I don't have a variable called boxes as they do. How can I get the coordinates?

Mandy asked Feb 21 '18 20:02




1 Answer

I tried printing output_dict['detection_boxes'] but I am not sure what the numbers mean

You can check the code yourself: visualize_boxes_and_labels_on_image_array is defined in the Object Detection API's visualization utilities module, which the tutorial imports as vis_util.

Note that you are passing use_normalized_coordinates=True. If you trace the function calls, you will see that each row of your output, such as [0.56213236, 0.2780568, 0.91445708, 0.69120586], holds the values [ymin, xmin, ymax, xmax] as fractions of the image height and width. The image (pixel) coordinates:

    (left, right, top, bottom) = (xmin * im_width, xmax * im_width,
                                  ymin * im_height, ymax * im_height)

are computed by the function:

    def draw_bounding_box_on_image(image,
                                   ymin,
                                   xmin,
                                   ymax,
                                   xmax,
                                   color='red',
                                   thickness=4,
                                   display_str_list=(),
                                   use_normalized_coordinates=True):
      """Adds a bounding box to an image.

      Bounding box coordinates can be specified in either absolute (pixel) or
      normalized coordinates by setting the use_normalized_coordinates argument.

      Each string in display_str_list is displayed on a separate line above the
      bounding box in black text on a rectangle filled with the input 'color'.
      If the top of the bounding box extends to the edge of the image, the strings
      are displayed below the bounding box.

      Args:
        image: a PIL.Image object.
        ymin: ymin of bounding box.
        xmin: xmin of bounding box.
        ymax: ymax of bounding box.
        xmax: xmax of bounding box.
        color: color to draw bounding box. Default is red.
        thickness: line thickness. Default value is 4.
        display_str_list: list of strings to display in box
                          (each to be shown on its own line).
        use_normalized_coordinates: If True (default), treat coordinates
          ymin, xmin, ymax, xmax as relative to the image.  Otherwise treat
          coordinates as absolute.
      """
      draw = ImageDraw.Draw(image)
      im_width, im_height = image.size
      if use_normalized_coordinates:
        (left, right, top, bottom) = (xmin * im_width, xmax * im_width,
                                      ymin * im_height, ymax * im_height)
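
If you want the pixel coordinates yourself rather than a drawn image, you can apply the same scaling to each row of output_dict['detection_boxes']. The following is only a minimal sketch, not part of the Object Detection API: the helper name boxes_to_pixels and the min_score cutoff of 0.5 are my own assumptions for illustration, and the sample boxes and scores are made up.

```python
def boxes_to_pixels(boxes, scores, im_width, im_height, min_score=0.5):
    """Convert normalized [ymin, xmin, ymax, xmax] rows to pixel boxes.

    Hypothetical helper (not part of the Object Detection API).
    Returns a list of (left, right, top, bottom) integer tuples for the
    detections whose score is at least min_score (an assumed cutoff).
    """
    pixel_boxes = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue  # skip low-confidence detections
        # Same scaling as in draw_bounding_box_on_image above.
        left = int(round(float(xmin) * im_width))
        right = int(round(float(xmax) * im_width))
        top = int(round(float(ymin) * im_height))
        bottom = int(round(float(ymax) * im_height))
        pixel_boxes.append((left, right, top, bottom))
    return pixel_boxes


# Made-up sample data standing in for output_dict['detection_boxes']
# and output_dict['detection_scores'].
sample_boxes = [[0.5, 0.25, 1.0, 0.75],
                [0.0, 0.0, 0.5, 0.5]]
sample_scores = [0.9, 0.3]

result = boxes_to_pixels(sample_boxes, sample_scores, im_width=640, im_height=480)
print(result)  # [(160, 480, 240, 480)]
```

In the tutorial, image_np is a NumPy array, so im_height and im_width should be available as image_np.shape[0] and image_np.shape[1]; the helper iterates row by row, so it should accept the NumPy arrays in output_dict as well as plain lists.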
MFisherKDX answered Sep 19 '22 15:09
