I am referring to Google's TensorFlow object detection API. I have successfully trained and tested the model on my objects. My question: after testing, I get an output image with boxes drawn around the detected objects. How do I get the coordinates of these boxes into a CSV file? The code for testing can be found at (https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb)
Looking at the helper code, it loads the image into a numpy array:
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)
During detection, it takes this image array and produces the output with boxes drawn, as follows:
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        # Definite input and output Tensors for detection_graph
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # Each box represents a part of the image where a particular object was detected.
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Each score represents the level of confidence for each of the objects.
        # The score is shown on the result image, together with the class label.
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        for image_path in TEST_IMAGE_PATHS:
            image = Image.open(image_path)
            # The array-based representation of the image will be used later
            # to prepare the result image with boxes and labels on it.
            image_np = load_image_into_numpy_array(image)
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Actual detection.
            (boxes, scores, classes, num) = sess.run(
                [detection_boxes, detection_scores, detection_classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            plt.figure(figsize=IMAGE_SIZE)
            plt.imshow(image_np)
I want to store the coordinates of these green boxes in a CSV file. What is a way to do it?
The coordinates in the boxes array ([ymin, xmin, ymax, xmax]) are normalized to [0, 1]. Therefore, you have to multiply them by the image's height/width to obtain the original pixel values.
To achieve this, you can do something like the following:
for box in np.squeeze(boxes):
    box[0] = box[0] * height
    box[1] = box[1] * width
    box[2] = box[2] * height
    box[3] = box[3] * width
Then you can save the boxes to your CSV using the numpy.savetxt() function (note that savetxt expects a 1-D or 2-D array, so the batch dimension has to be squeezed away first):

import numpy as np
np.savetxt('yourfile.csv', np.squeeze(boxes), delimiter=',')
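To make the output concrete, here is a minimal, self-contained sketch using made-up box values in place of the real detector output; writing to an in-memory buffer instead of 'yourfile.csv' just makes the result easy to inspect:

```python
import io

import numpy as np

# Hypothetical stand-in for np.squeeze(boxes): each row is [ymin, xmin, ymax, xmax].
boxes = np.array([
    [0.10, 0.20, 0.50, 0.60],
    [0.05, 0.15, 0.40, 0.55],
])

# np.savetxt writes one box per row; passing 'yourfile.csv' instead of the
# buffer writes the same content to disk.
buf = io.StringIO()
np.savetxt(buf, boxes, delimiter=',', fmt='%.2f')
print(buf.getvalue())
```

This shows why you end up with one CSV row per detected region, which the next paragraph addresses.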
As pointed out in the comments, the approach above gives a list of the coordinates of every box. This is because the boxes tensor holds the coordinates of every detected region, not just the confident ones. One quick fix, assuming you use the default confidence acceptance threshold of 0.5, is the following:
for i, box in enumerate(np.squeeze(boxes)):
    if np.squeeze(scores)[i] > 0.5:
        print("ymin={}, xmin={}, ymax={}, xmax={}".format(
            box[0] * height, box[1] * width, box[2] * height, box[3] * width))
This should print the four values only for the confident detections, not for every box. Each value represents one edge of the bounding box, in pixels.
If you use another confidence acceptance threshold, you have to adjust this value accordingly. You could also parse the model configuration for this parameter.
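One way to do that, sketched below under the assumption that your pipeline.config contains a score_threshold field under post_processing (as the Object Detection API configs typically do), is a simple regex lookup; the config snippet here is made up for illustration:

```python
import re

# Hypothetical excerpt of a pipeline.config; in practice, read the file you
# trained with: config_text = open('pipeline.config').read()
config_text = """
post_processing {
  batch_non_max_suppression {
    score_threshold: 0.5
    iou_threshold: 0.6
  }
}
"""

# A quick regex parse; the official route would be the API's protobuf config
# utilities, but this avoids extra dependencies for a one-off lookup.
match = re.search(r'score_threshold:\s*([0-9.]+)', config_text)
threshold = float(match.group(1)) if match else 0.5  # fall back to the default
print(threshold)
```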
To store the coordinates as CSV, you can do something like:
new_boxes = []
for i, box in enumerate(np.squeeze(boxes)):
    if np.squeeze(scores)[i] > 0.5:
        new_boxes.append(box)
np.savetxt('yourfile.csv', new_boxes, delimiter=',')
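If you also want a header row and pixel coordinates in the file, the csv module from the standard library works as well. The following sketch puts the filtering, scaling, and writing together, using made-up boxes, scores, and image dimensions in place of the real detector output:

```python
import csv
import io

import numpy as np

# Made-up stand-ins for np.squeeze(boxes) and np.squeeze(scores).
boxes = np.array([
    [0.10, 0.20, 0.50, 0.60],
    [0.70, 0.10, 0.95, 0.30],
])
scores = np.array([0.9, 0.3])
height, width = 480, 640  # hypothetical image size

# Keep only confident boxes, convert to pixels, and write a labelled CSV.
out = io.StringIO()  # use open('yourfile.csv', 'w', newline='') for a real file
writer = csv.writer(out)
writer.writerow(['ymin', 'xmin', 'ymax', 'xmax'])
for box, score in zip(boxes, scores):
    if score > 0.5:
        writer.writerow([box[0] * height, box[1] * width,
                         box[2] * height, box[3] * width])
print(out.getvalue())
```

Here the second box is dropped because its score (0.3) is below the 0.5 threshold, so only one data row is written.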