
Parsing an XML file for unknown elements using Python ElementTree

I want to extract all the tag names and their corresponding data from a multi-purpose XML file, then save that information in a Python dictionary (e.g. tag = key, data = value). The catch is that the tag names and values are unknown, and so is their number.

    <some_root_name>
        <tag_x>bubbles</tag_x>
        <tag_y>car</tag_y>
        <tag...>42</tag...>
    </some_root_name>

I'm using ElementTree and can successfully extract the root tag, and I can extract values by referencing the tag names, but I haven't found a way to simply iterate over the tags and their data without referencing a tag name.

Any help would be great.

Thank you.

Markus asked Mar 20 '26 03:03



1 Answer

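    from lxml import etree as ET

    # NOTE: "<tag...>" is the question's placeholder, not a valid XML name;
    # replace it with a real tag name before running.
    xmlString = """
        <some_root_name>
            <tag_x>bubbles</tag_x>
            <tag_y>car</tag_y>
            <tag...>42</tag...>
        </some_root_name> """

    document = ET.fromstring(xmlString)
    # iter() visits the root and every descendant element;
    # it replaces the deprecated getiterator().
    for elementtag in document.iter():
        print("elementtag name:", elementtag.tag)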
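The question also asks for the tags and their data in a dictionary. Here is a minimal sketch of one way to do that, assuming the standard-library `xml.etree.ElementTree` (the module named in the question) and a flat document like the sample. The helper name `children_to_dict` and the tag `tag_z` are made up for illustration; `tag_z` only stands in for the elided `<tag...>` so that the XML parses.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: "tag_z" stands in for the question's elided "<tag...>".
xml_string = """
<some_root_name>
    <tag_x>bubbles</tag_x>
    <tag_y>car</tag_y>
    <tag_z>42</tag_z>
</some_root_name>
"""


def children_to_dict(root):
    """Map each direct child's tag name to its text, without naming any tag."""
    # Iterating over an Element yields its direct children in document order.
    # A repeated tag name overwrites the earlier entry; values stay strings.
    return {child.tag: (child.text or "").strip() for child in root}


root = ET.fromstring(xml_string)
tag_data = children_to_dict(root)
print(tag_data)  # {'tag_x': 'bubbles', 'tag_y': 'car', 'tag_z': '42'}
```

If tag names can repeat or elements can nest, a plain dictionary loses information; collecting the values into lists, or recursing with `root.iter()`, would probably be needed in that case.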

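EDIT: To read from a file instead of from a string:

    # parse() returns an ElementTree rather than an Element;
    # call document.getroot() if you need the root element itself.
    document = ET.parse("myxmlfile.xml")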
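One detail worth knowing when switching to a file: `parse()` gives back a tree object, not the root element. A small sketch of the difference, again assuming the standard-library `xml.etree.ElementTree`. The temporary file and its contents are invented here only so the example runs on its own; in practice the path would be your own file.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Write a small hypothetical file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write(
        "<some_root_name><tag_x>bubbles</tag_x><tag_y>car</tag_y></some_root_name>"
    )
    path = handle.name

tree = ET.parse(path)   # ElementTree wrapping the whole document
root = tree.getroot()   # the <some_root_name> Element

# iter() yields the root first, then every descendant in document order.
tag_names = [element.tag for element in root.iter()]
os.remove(path)

print(tag_names)  # ['some_root_name', 'tag_x', 'tag_y']
```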
Kristofer answered Mar 21 '26 19:03



