I am trying to iterate over all nodes in a tree using ElementTree.
I do something like:
    import xml.etree.ElementTree as ET

    tree = ET.parse("/tmp/test.xml")
    root = tree.getroot()
    for child in root:
        pass  # do something with child
The problem is that child is an Element object and not an ElementTree object, so I can't look further into it and recurse to iterate over its elements. Is there a way to iterate differently over "root" so that it iterates over the top-level nodes in the tree (immediate children) and returns the same class as root itself?
ElementTree is a Python standard-library module that allows you to parse and navigate an XML document. It breaks the XML document down into a tree structure that is easy to work with. When in doubt, print it out ( print(ET.
To read an XML file using ElementTree, first import the ElementTree module from the xml.etree package under the name ET (a common convention). Then pass the filename of the XML file to ET.parse() to parse it. Finally, get the root (the top-level tag) of the document with getroot().
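As a minimal sketch of those steps: the sample document and the temporary file below are made up for illustration and are not part of the original post.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample document, written to a temporary file so the
# sketch does not depend on any file already being on disk.
SAMPLE_XML = "<data><country name='Panama'><rank>68</rank></country></data>"

with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as f:
    f.write(SAMPLE_XML)
    path = f.name

try:
    tree = ET.parse(path)   # parse() returns an ElementTree
    root = tree.getroot()   # getroot() returns the top-level Element
finally:
    os.remove(path)         # clean up the temporary file

# Looping over an Element yields its immediate children only.
child_tags = [child.tag for child in root]
print(type(tree).__name__, root.tag, child_tags)
```

Note that parse() gives back the tree wrapper, while getroot() gives back an ordinary Element.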
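To iterate over all nodes, use the iter method on the ElementTree rather than looping directly over the root Element. The root is an Element, just like the other elements in the tree, so looping over it (for child in root) yields only its immediate children. iter() walks the whole tree in document order, starting at the root. Element objects provide iter() as well, so root.iter() visits the same nodes.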
For example, given this XML:

    <?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <rank>1</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <rank>4</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighbor name="Malaysia" direction="N"/>
        </country>
        <country name="Panama">
            <rank>68</rank>
            <year>2011</year>
            <gdppc>13600</gdppc>
            <neighbor name="Costa Rica" direction="W"/>
            <neighbor name="Colombia" direction="E"/>
        </country>
    </data>
You can do the following:

    >>> import xml.etree.ElementTree as ET
    >>> tree = ET.parse('test.xml')
    >>> for elem in tree.iter():
    ...     print(elem)
    ...
    <Element 'data' at 0x10b2d7b50>
    <Element 'country' at 0x10b2d7b90>
    <Element 'rank' at 0x10b2d7bd0>
    <Element 'year' at 0x10b2d7c50>
    <Element 'gdppc' at 0x10b2d7d10>
    <Element 'neighbor' at 0x10b2d7e90>
    <Element 'neighbor' at 0x10b2d7ed0>
    <Element 'country' at 0x10b2d7f10>
    <Element 'rank' at 0x10b2d7f50>
    <Element 'year' at 0x10b2d7f90>
    <Element 'gdppc' at 0x10b2d7fd0>
    <Element 'neighbor' at 0x10b2db050>
    <Element 'country' at 0x10b2db090>
    <Element 'rank' at 0x10b2db0d0>
    <Element 'year' at 0x10b2db110>
    <Element 'gdppc' at 0x10b2db150>
    <Element 'neighbor' at 0x10b2db190>
    <Element 'neighbor' at 0x10b2db1d0>
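Since the question asks about recursing, it may help to note that each child yielded by looping over root is an Element of the same class as root, so a plain recursive function works too. The following is only a sketch: the helper name walk is made up, and the inline document is a shortened copy of the sample above. Under those assumptions it should visit elements in the same order as iter().

```python
import xml.etree.ElementTree as ET

# Shortened, inline copy of the sample document (an assumption made
# so the sketch runs without test.xml on disk).
XML = """<data>
  <country name="Liechtenstein">
    <rank>1</rank>
    <neighbor name="Austria" direction="E"/>
  </country>
  <country name="Singapore">
    <rank>4</rank>
  </country>
</data>"""


def walk(element, depth=0):
    """Yield (depth, element) for element and all of its descendants."""
    yield depth, element
    for child in element:  # iterating an Element gives immediate children
        yield from walk(child, depth + 1)


root = ET.fromstring(XML)  # fromstring() returns the root Element directly
visited = list(walk(root))

for depth, elem in visited:
    print("  " * depth + elem.tag)

# Compare the hand-written recursion with the built-in traversal.
recursive_tags = [elem.tag for _, elem in visited]
iter_tags = [elem.tag for elem in root.iter()]
```

The recursive version is mainly useful when the depth of each node matters (for indentation, say); when it does not, iter() is shorter.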
Adding to Robert Christie's answer, it is possible to iterate over all nodes of a document parsed with fromstring() by converting the Element to an ElementTree:

    import xml.etree.ElementTree as ET

    # xml_string is assumed to hold the XML document as a string
    e = ET.ElementTree(ET.fromstring(xml_string))
    for elt in e.iter():
        print("%s: '%s'" % (elt.tag, elt.text))