
How can I convert from [xmin ymin xmax ymax] to [x y width height] normalized to the image?

I am building a custom vision application with Microsoft's CustomVision.ai.

I am using this tutorial.

When you tag images in object detection projects, you need to specify the region of each tagged object using normalized coordinates.

I have an XML file containing the annotations for each image, e.g. for sample_1.jpg:

<annotation>
    <filename>sample_1.jpg</filename>
    <size>
        <width>410</width>
        <height>400</height>
        <depth>3</depth>
    </size>
    <object>
        <bndbox>
            <xmin>159</xmin>
            <ymin>15</ymin>
            <xmax>396</xmax>
            <ymax>302</ymax>
        </bndbox>
    </object>
</annotation>

I have to convert the bounding box coordinates from xmin, ymin, xmax, ymax to normalized x, y, w, h coordinates, as the tutorial requires.

Can anyone provide me a conversion function?

glima asked Nov 16 '25 22:11


2 Answers

Assuming (xmin, ymin) and (xmax, ymax) are the top-left and bottom-right corners of the bounding box, respectively, then in pixels:

x = xmin
y = ymin
w = xmax - xmin
h = ymax - ymin

You then need to normalize these, which means expressing them as a proportion of the whole image: simply divide x and w by the image width and y and h by the image height, using the <size> values from the annotation:

x = xmin / width
y = ymin / height
w = (xmax - xmin) / width
h = (ymax - ymin) / height

This assumes a top-left origin; if your target format uses a different origin, you will have to apply a shift.
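
The formulas above can be wrapped into a small function. The following is a minimal sketch, not a definitive implementation: the function names `voc_to_normalized_xywh` and `regions_from_annotation` are made up for illustration, and the XML layout is assumed to match the sample in the question (a `<size>` element plus one or more `<bndbox>` elements with integer pixel values).

```python
import xml.etree.ElementTree as ET


def voc_to_normalized_xywh(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corners to normalized (x, y, w, h), top-left origin."""
    x = xmin / img_w
    y = ymin / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return x, y, w, h


def regions_from_annotation(xml_text):
    """Return one normalized (x, y, w, h) tuple per <bndbox> in the XML.

    Assumes the annotation layout shown in the question.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = int(size.findtext("width"))
    img_h = int(size.findtext("height"))

    regions = []
    for box in root.iter("bndbox"):
        corners = [int(box.findtext(tag))
                   for tag in ("xmin", "ymin", "xmax", "ymax")]
        regions.append(voc_to_normalized_xywh(*corners, img_w, img_h))
    return regions


if __name__ == "__main__":
    # Values taken from sample_1.jpg in the question.
    print(voc_to_normalized_xywh(159, 15, 396, 302, 410, 400))
```

For the sample box this gives x = 159/410, y = 15/400 = 0.0375, w = 237/410 and h = 287/400 = 0.7175, so every value falls between 0 and 1 as a normalized region should.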

N. Smith answered Nov 18 '25 12:11


There is a more straightforward way to do this with pybboxes. Install it with:

pip install pybboxes

In your case:

import pybboxes as pbx

voc_bbox = (159, 15, 396, 302)
W, H = 410, 400  # WxH of the image
coco_bbox = pbx.convert_bbox(voc_bbox, from_type="voc", to_type="coco")
print(coco_bbox)  # (159, 15, 237, 287)

Note that converting to YOLO format requires the image width and height for scaling.
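
One thing to watch for: the COCO result above is still in pixels, while YOLO boxes are conventionally stored as normalized center x, center y, width, height rather than a top-left corner. If the target format expects the top-left corner, as the first answer assumes, a YOLO box needs to be shifted by half its width and height. Here is a minimal plain-Python sketch of that relationship; the helper names `voc_to_yolo` and `yolo_to_top_left` are hypothetical and not part of pybboxes.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Pixel corners -> normalized (center_x, center_y, w, h)."""
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    cx = (xmin + xmax) / 2 / img_w
    cy = (ymin + ymax) / 2 / img_h
    return cx, cy, w, h


def yolo_to_top_left(cx, cy, w, h):
    """Normalized center box -> normalized (left, top, w, h)."""
    return cx - w / 2, cy - h / 2, w, h


if __name__ == "__main__":
    W, H = 410, 400  # WxH of the image, from the question
    yolo_box = voc_to_yolo(159, 15, 396, 302, W, H)
    print(yolo_box)                     # center-based, normalized
    print(yolo_to_top_left(*yolo_box))  # top-left based, normalized
```

Shifting the center back by half the box size recovers xmin / width and ymin / height, which matches the formulas in the first answer.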

null answered Nov 18 '25 13:11