I have this kind of XML structure (output from the Esprima ASL converted from JSON), and it can get even more nested than this (ASL.xml):
<?xml version="1.0" encoding="UTF-8" ?>
<program>
  <type>Program</type>
  <body>
    <type>VariableDeclaration</type>
    <declarations>
      <type>VariableDeclarator</type>
      <id>
        <type>Identifier</type>
        <name>answer</name>
      </id>
      <init>
        <type>BinaryExpression</type>
        <operator>*</operator>
        <left>
          <type>Literal</type>
          <value>6</value>
        </left>
        <right>
          <type>Literal</type>
          <value>7</value>
        </right>
      </init>
    </declarations>
    <kind>var</kind>
  </body>
</program>
Usually for XML I use `for node in root.childNodes`, but this works only for the direct children:
import xml.dom.minidom as md

dom = md.parse("ASL.xml")
root = dom.documentElement
for node in root.childNodes:
    if node.nodeType == node.ELEMENT_NODE:
        print(node.tagName, "has value:", node.nodeValue, "and is child of:", node.parentNode.tagName)
How can I walk all the elements of the XML, regardless of how deeply they are nested?
To iterate over all nodes with xml.etree.ElementTree, use the iter method. ElementTree.iter() walks the whole document starting at the root. Element.iter() walks that element and everything nested beneath it, so calling it on the root element visits the same nodes. Iterating over an Element directly (for child in root) yields only its direct children.
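As a minimal sketch of that approach, the snippet below wraps a parsed root in an ElementTree and calls iter on it. The inline XML string and the variable names are made up for illustration and stand in for ASL.xml. Passing a tag name to iter restricts the walk to matching elements at any depth.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for ASL.xml.
XML = """<program>
  <type>Program</type>
  <body>
    <type>VariableDeclaration</type>
    <kind>var</kind>
  </body>
</program>"""

tree = ET.ElementTree(ET.fromstring(XML))

# iter() with no argument yields every element, however deeply nested.
all_tags = [elt.tag for elt in tree.iter()]

# iter("type") yields only the <type> elements, at any depth.
type_values = [elt.text for elt in tree.iter("type")]

print(all_tags)
print(type_values)
```

The elements come back in document order, so a nested `<type>` appears right after the element that contains it.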
Example explained: Load the XML string into xmlDoc. Get the child nodes of the root element. For each child node, output the node name and the node value of the text node.
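Those steps could look roughly like this with xml.dom.minidom. The short XML string and the names xml_doc and pairs are assumptions for illustration. The point of the sketch is that an element's own nodeValue is always None in the DOM; the text sits on a child text node.

```python
import xml.dom.minidom as md

# Hypothetical, shortened stand-in for the XML in the question.
XML = "<id><type>Identifier</type><name>answer</name></id>"

xml_doc = md.parseString(XML)
root = xml_doc.documentElement

pairs = []
for child in root.childNodes:
    if child.nodeType != child.ELEMENT_NODE:
        continue  # skip whitespace text nodes and comments
    text_node = child.firstChild
    # The element itself has nodeValue None; read its child text node instead.
    is_text = text_node is not None and text_node.nodeType == text_node.TEXT_NODE
    pairs.append((child.nodeName, text_node.nodeValue if is_text else None))

print(pairs)
```

This only covers the direct children of the root, which is exactly the limitation the question describes.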
This is probably best achieved with a recursive function. Something like this should do it but I've not tested it so consider it pseudocode.
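import xml.dom.minidom as md

def print_node(root):
    # Visit every element child, then recurse into it.
    for node in root.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            # nodeValue is always None for element nodes, so read the
            # element's leading text node instead.
            first = node.firstChild
            is_text = first is not None and first.nodeType == first.TEXT_NODE
            value = first.data.strip() if is_text else ""
            print(node.tagName, "has value:", repr(value), "and is child of:", node.parentNode.tagName)
            print_node(node)

dom = md.parse("ASL.xml")
root = dom.documentElement
print_node(root)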
If it's not important to use xml.dom.minidom:
import xml.etree.ElementTree as ET

tree = ET.fromstring("""...""")
for elt in tree.iter():
    # elt.text is None for an element with no text at all, so guard before strip().
    print("%s: '%s'" % (elt.tag, (elt.text or "").strip()))
Output:
program: ''
type: 'Program'
body: ''
type: 'VariableDeclaration'
declarations: ''
type: 'VariableDeclarator'
id: ''
type: 'Identifier'
name: 'answer'
init: ''
type: 'BinaryExpression'
operator: '*'
left: ''
type: 'Literal'
value: '6'
right: ''
type: 'Literal'
value: '7'
kind: 'var'
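This output does not include the "is child of" information the question asked for, because ElementTree elements carry no reference to their parent. One common workaround, sketched here with a made-up XML fragment and hypothetical names (parent_of, lines), is to build a child-to-parent mapping first:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modelled on the <init> part of the question's XML.
XML = """<init>
  <type>BinaryExpression</type>
  <left>
    <type>Literal</type>
    <value>6</value>
  </left>
</init>"""

root = ET.fromstring(XML)

# Map each element to its parent; the root has no entry.
parent_of = {child: parent for parent in root.iter() for child in parent}

lines = []
for elt in root.iter():
    parent = parent_of.get(elt)
    # Compare with None explicitly: an Element without children is falsy.
    parent_tag = parent.tag if parent is not None else None
    lines.append((elt.tag, (elt.text or "").strip(), parent_tag))

for tag, text, parent_tag in lines:
    print("%s: '%s' (child of %s)" % (tag, text, parent_tag))
```

The mapping costs one extra pass over the tree, which is usually acceptable for documents of this size.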