I have annotations in xml files such as this one, which follows the PASCAL VOC convention:
<annotation>
    <folder>training</folder>
    <filename>chanel1.jpg</filename>
    <source>
        <database>synthetic initialization</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>synthetic</image>
        <flickrid>none</flickrid>
    </source>
    <owner>
        <flickrid>none</flickrid>
        <name>none</name>
    </owner>
    <size>
        <width>640</width>
        <height>427</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>chanel</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>344</xmin>
            <ymin>10</ymin>
            <xmax>422</xmax>
            <ymax>83</ymax>
        </bndbox>
    </object>
    <object>
        <name>chanel</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>355</xmin>
            <ymin>165</ymin>
            <xmax>443</xmax>
            <ymax>206</ymax>
        </bndbox>
    </object>
</annotation>
What is the cleanest way of retrieving, for example, the fields filename and bndbox in Python?
I was trying to use ElementTree, which seems to be the official Python solution, but I can't make it work.
My code so far:
from xml.etree import ElementTree as ET
tree = ET.parse("data/all/annotations/" + file)
fn = tree.find('filename').text
boxes = tree.findall('bndbox')
This produces:
fn == 'chanel1.jpg'
boxes == []
So it successfully extracts the filename field, but not the bndbox elements.
Pascal Visual Object Classes (VOC): Pascal VOC stores annotations as XML files, unlike COCO, which uses JSON. In Pascal VOC there is one annotation file for each image in the dataset. In COCO there is one file per split (training, testing, and validation), each covering that entire split. The bounding-box representations in the Pascal VOC and COCO formats also differ.
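To make the bounding-box difference concrete: Pascal VOC stores the two corners (xmin, ymin, xmax, ymax), while COCO stores the top-left corner plus a size (x, y, width, height). A minimal sketch of converting between them follows; the function names `voc_to_coco` and `coco_to_voc` are hypothetical, and the sketch deliberately ignores any pixel-indexing offset a particular tool may apply.

```python
def voc_to_coco(box):
    """Convert [xmin, ymin, xmax, ymax] to [x, y, width, height]."""
    xmin, ymin, xmax, ymax = box
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def coco_to_voc(box):
    """Convert [x, y, width, height] back to [xmin, ymin, xmax, ymax]."""
    x, y, width, height = box
    return [x, y, x + width, y + height]


# First box from the annotation in the question.
voc_box = [344, 10, 422, 83]
coco_box = voc_to_coco(voc_box)
round_trip = coco_to_voc(coco_box)
```

Converting there and back returns the original corners, which is a quick sanity check when mixing datasets in the two formats.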
Per its specification, PASCAL VOC annotations are defined in human-readable XML, in a file with the same name as the image (except for the extension). The file should contain, among other items, folder: the parent directory of the image.
pascal-voc-writer (AndrewCarterUK, on GitHub) is a Python library for generating image annotation XML files in the PASCAL VOC format.
The PASCAL Visual Object Classes (VOC) project is one of the earliest computer vision projects that aimed to standardize dataset and annotation formats. The annotations can be used for image classification and object detection tasks.
Here is a fairly simple solution to your problem:
This returns the filename and the box coordinates as a nested list of [xmin, ymin, xmax, ymax]. I once struggled with bndbox tags that were in a mixed-up order (ymin, xmin, ...) or other strange combinations, so this code reads the values by tag name rather than by position.
Finally I updated the code. Thanks to craq and Pritesh Gohil, you were absolutely right.
Hope it helps...
import xml.etree.ElementTree as ET

def read_content(xml_file: str):
    # Parse the annotation file and get the root <annotation> element.
    tree = ET.parse(xml_file)
    root = tree.getroot()
    # The filename appears once per file, so read it outside the loop;
    # this also keeps it defined when the file contains no objects.
    filename = root.find('filename').text
    list_with_all_boxes = []
    for boxes in root.iter('object'):
        # Look up each coordinate by tag name, so tag order does not matter.
        ymin = int(boxes.find("bndbox/ymin").text)
        xmin = int(boxes.find("bndbox/xmin").text)
        ymax = int(boxes.find("bndbox/ymax").text)
        xmax = int(boxes.find("bndbox/xmax").text)
        list_with_single_boxes = [xmin, ymin, xmax, ymax]
        list_with_all_boxes.append(list_with_single_boxes)
    return filename, list_with_all_boxes

name, boxes = read_content("file.xml")
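As for why the attempt in the question returned an empty list: in ElementTree, find and findall with a bare tag name only match direct children of the element they are called on. filename is a direct child of the root, but bndbox is nested inside object, so it is never reached. Prefixing the path with `.//` searches all descendants instead. A minimal sketch, assuming the same layout as the file in the question; the inline `SAMPLE` string and the helper name `boxes_from_voc` are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the annotation file in the question (illustrative only).
SAMPLE = """<annotation>
  <filename>chanel1.jpg</filename>
  <object>
    <name>chanel</name>
    <bndbox><xmin>344</xmin><ymin>10</ymin><xmax>422</xmax><ymax>83</ymax></bndbox>
  </object>
  <object>
    <name>chanel</name>
    <bndbox><xmin>355</xmin><ymin>165</ymin><xmax>443</xmax><ymax>206</ymax></bndbox>
  </object>
</annotation>"""


def boxes_from_voc(root):
    """Hypothetical helper: return [xmin, ymin, xmax, ymax] for every bndbox."""
    keys = ("xmin", "ymin", "xmax", "ymax")
    # ".//bndbox" matches bndbox elements at any depth below root.
    return [[int(b.find(k).text) for k in keys] for b in root.findall(".//bndbox")]


root = ET.fromstring(SAMPLE)
direct = root.findall("bndbox")        # direct children only, so nothing matches
nested = boxes_from_voc(root)          # descendant search finds both boxes
filename = root.find("filename").text  # filename is a direct child, so this works
```

Here `direct` is empty while `nested` holds both boxes, which mirrors the behaviour described in the question.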