How to recursively iterate over XML tags in Python using ElementTree?

Tags: python, xml

I am trying to iterate over all nodes in a tree using ElementTree.

I do something like:

  tree = ET.parse("/tmp/test.xml")
  root = tree.getroot()
  for child in root:
      ### do something with child

The problem is that child is an Element object, not an ElementTree object, so I can't look further into it and recurse over its elements. Is there a way to iterate over "root" differently, so that it iterates over the top-level nodes in the tree (immediate children) and returns objects of the same class as root itself?

kloop asked Jan 12 '14 12:01



2 Answers

To iterate over all nodes, use the iter method on the ElementTree. Element has the same method, so root.iter() visits the same nodes.

The root is an Element, just like the other elements in the tree; it holds its own attributes and children. The ElementTree wraps the root and represents the document as a whole.

For example, given this XML:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

You can do the following:

>>> import xml.etree.ElementTree as ET
>>> tree = ET.parse('test.xml')
>>> for elem in tree.iter():
...     print(elem)
...
<Element 'data' at 0x10b2d7b50>
<Element 'country' at 0x10b2d7b90>
<Element 'rank' at 0x10b2d7bd0>
<Element 'year' at 0x10b2d7c50>
<Element 'gdppc' at 0x10b2d7d10>
<Element 'neighbor' at 0x10b2d7e90>
<Element 'neighbor' at 0x10b2d7ed0>
<Element 'country' at 0x10b2d7f10>
<Element 'rank' at 0x10b2d7f50>
<Element 'year' at 0x10b2d7f90>
<Element 'gdppc' at 0x10b2d7fd0>
<Element 'neighbor' at 0x10b2db050>
<Element 'country' at 0x10b2db090>
<Element 'rank' at 0x10b2db0d0>
<Element 'year' at 0x10b2db110>
<Element 'gdppc' at 0x10b2db150>
<Element 'neighbor' at 0x10b2db190>
<Element 'neighbor' at 0x10b2db1d0>
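
Since the question asks about recursion specifically: iterating over an Element yields its immediate children, and those children are Elements too, so a recursive function can take any of them as its argument. Here is a minimal sketch of that idea, assuming Python 3.3 or later (for `yield from`). The helper name `walk` and the trimmed sample document are made up for illustration and are not part of ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a trimmed version of the country data.
XML = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
    </country>
</data>"""


def walk(element, depth=0):
    """Yield (depth, element) for element and all of its descendants."""
    yield depth, element
    # Iterating over an Element yields its immediate children.
    # Each child is an Element as well, so the same function applies.
    for child in element:
        yield from walk(child, depth + 1)


root = ET.fromstring(XML)
visited = [(depth, elem.tag) for depth, elem in walk(root)]

# Print the tags indented by their depth in the tree.
for depth, tag in visited:
    print("  " * depth + tag)
```

The recursion visits nodes in the same document order as iter(); the difference is that the depth is available at each step, which iter() does not report.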
Robert Christie answered Sep 29 '22 19:09


Adding to Robert Christie's answer: when the XML is in a string, you can still iterate over all nodes by parsing it with fromstring() and wrapping the resulting Element in an ElementTree:

import xml.etree.ElementTree as ET

# xml_string is assumed to hold the XML document as a string
e = ET.ElementTree(ET.fromstring(xml_string))
for elt in e.iter():
    print("%s: '%s'" % (elt.tag, elt.text))
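
For a version that runs as-is, here is a small sketch with a made-up `xml_string` (any well-formed XML string would do). It also compares the wrapped ElementTree with the bare Element returned by fromstring(), since Element has an iter() method as well and the wrapping step appears to be optional for plain iteration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, chosen only for illustration.
xml_string = "<data><country name='Panama'><rank>68</rank></country></data>"

root = ET.fromstring(xml_string)   # fromstring() returns the root Element
tree = ET.ElementTree(root)        # wrapper around that same Element

# Collect (tag, text) pairs both ways to compare them.
from_tree = [(elt.tag, elt.text) for elt in tree.iter()]
from_root = [(elt.tag, elt.text) for elt in root.iter()]

for tag, text in from_root:
    print("%s: %r" % (tag, text))
```

Both lists hold the same pairs in the same order. Elements that contain only child elements and no text of their own report None for elt.text.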
ssjadon answered Sep 29 '22 18:09