I'm trying to parse below XML using Python ElementTree to product output as below. I'm trying to write modules for top elements to print them. However It is slightly tricky as category element may or may not have property and cataegory element may have a category element inside.
I've referred to previous question in this topic, but they did not consist of nested elements with same name
My Code: http://pastebin.com/Fsv2Xzqf
work.xml:
<suite id="1" name="MainApplication">
<displayNameKey>my Application</displayNameKey>
<displayName>my Application</displayName>
<application id="2" name="Sub Application1">
<displayNameKey>sub Application1</displayNameKey>
<displayName>sub Application1</displayName>
<category id="2423" name="about">
<displayNameKey>subApp.about</displayNameKey>
<displayName>subApp.about</displayName>
<category id="2423" name="comms">
<displayNameKey>subApp.comms</displayNameKey>
<displayName>subApp.comms</displayName>
<property id="5909" name="copyright" type="string_property" width="40">
<value>2014</value>
</property>
<property id="5910" name="os" type="string_property" width="40">
<value>Linux 2.6.32-431.29.2.el6.x86_64</value>
</property>
</category>
<property id="5908" name="releaseNumber" type="string_property" width="40">
<value>9.1.0.3.0.54</value>
</property>
</category>
</application>
</suite>
Output should be as below:
Suite: MainApplication
Application: Sub Application1
Category: about
property: releaseNumber | 9.1.0.3.0.54
category: comms
property: copyright | 2014
property: os | Linux 2.6.32-431.29.2.el6.x86_64
Any pointers in right direction would be of help.
import xml.etree.ElementTree as ET
tree = ET.ElementTree(file='work.xml')
indent = 0
ignoreElems = ['displayNameKey', 'displayName']
def printRecur(root):
"""Recursively prints the tree."""
if root.tag in ignoreElems:
return
print ' '*indent + '%s: %s' % (root.tag.title(), root.attrib.get('name', root.text))
global indent
indent += 4
for elem in root.getchildren():
printRecur(elem)
indent -= 4
root = tree.getroot()
printRecur(root)
OUTPUT:
Suite: MainApplication
Application: Sub Application1
Category: about
Category: comms
Property: copyright
Value: 2014
Property: os
Value: Linux 2.6.32-431.29.2.el6.x86_64
Property: releaseNumber
Value: 9.1.0.3.0.54
This is closest I could get in 5 minutes. You should just recursively call a processor function and that would take care. You can improve on from this point :)
You can also define handler function for each tag and put all of them in a dictionary for easy lookup. Then you can check if you have an appropriate handler function for that tag, then call that else just continue with blindly printing. For example:
HANDLERS = {
'property': 'handle_property',
<tag_name>: <handler_function>
}
def handle_property(root):
"""Takes property root element and prints the values."""
data = ' '*indent + '%s: %s ' % (root.tag.title(), root.attrib['name'])
values = []
for elem in root.getchildren():
if elem.tag == 'value':
values.append(elem.text)
print data + '| %s' % (', '.join(values))
# printRecur would get modified accordingly.
def printRecur(root):
"""Recursively prints the tree."""
if root.tag in ignoreElems:
return
global indent
indent += 4
if root.tag in HANDLERS:
handler = globals()[HANDLERS[root.tag]]
handler(root)
else:
print ' '*indent + '%s: %s' % (root.tag.title(), root.attrib.get('name', root.text))
for elem in root.getchildren():
printRecur(elem)
indent -= 4
Output with above:
Suite: MainApplication
Application: Sub Application1
Category: about
Category: comms
Property: copyright | 2014
Property: os | Linux 2.6.32-431.29.2.el6.x86_64
Property: releaseNumber | 9.1.0.3.0.54
I find this very useful rather than putting tons of if/else in the code.
If you want a barebones XML recursive tree parser snippet:
from xml.etree import ElementTree
tree = ElementTree.parse('english_saheeh.xml')
root = tree.getroot()
def walk_tree_recursive(root):
#do whatever with .tags here
for child in root:
walk_tree_recursive(child)
walk_tree_recursive(root)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With