I have an XML file like this:
<hierachy>
    <att>
        <Order>1</Order>
        <attval>Data</attval>
        <children>
            <att>
                <Order>1</Order>
                <attval>Studyval</attval>
            </att>
            <att>
                <Order>2</Order>
                <attval>Site</attval>
            </att>
        </children>
    </att>
    <att>
        <Order>2</Order>
        <attval>Info</attval>
        <children>
            <att>
                <Order>1</Order>
                <attval>age</attval>
            </att>
            <att>
                <Order>2</Order>
                <attval>gender</attval>
            </att>
        </children>
    </att>
</hierachy>
I'm trying to convert it to a CSV file like this:
Data,Studyval
Data,Site
Info,age
Info,gender
My problem is that the parent and child elements have the same names, 'att' and 'attval'. How do I tell Python to distinguish between them and give me this output?
I tried this:
import xml.etree.cElementTree as ET
tree = ET.parse('input.xml')
rebase = tree.getroot()
list = []
for att in rebase.findall('att'):
    name = att.find('attval').text
    for each_att in att.findall('attval'):
        try:
            val = att.find('attval').text
            print name, val
        except AttributeError:
            print name
but it printed the same value twice on each line.
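The shared tag names are not the problem: find and findall with a plain tag name only match direct children of the element they are called on. Your inner loop calls att.findall('attval') and att.find('attval') on the outer att again, so name and val always hold the same text. Instead, walk the tree from top to bottom: take each top-level att, read its attval, then step into its children element and read the attval of each nested att.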
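from xml.etree import ElementTree

tree = ElementTree.parse('input.xml')
root = tree.getroot()

# Iterating an element yields only its direct children (the top-level att elements).
for att in root:
    first = att.find('attval').text
    # Step into <children> so the nested att elements are reached explicitly.
    for subatt in att.find('children'):
        second = subatt.find('attval').text
        print('{},{}'.format(first, second))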
Which gives:
$ python process.py
Data,Studyval
Data,Site
Info,age
Info,gender
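Since the goal is a CSV file rather than printed output, the same traversal could be combined with the standard csv module. This is only a minimal sketch: the names parent_child_rows and xml_to_csv and the output file name are my own and not from the question, and skipping any att that has no children element is an assumption about what you want in that case.

```python
import csv
from xml.etree import ElementTree


def parent_child_rows(root):
    """Yield (parent, child) attval pairs for each top-level att element."""
    for att in root.findall('att'):          # direct children of the root only
        parent = att.findtext('attval')
        children = att.find('children')
        if children is None:
            continue                         # assumption: no children, no rows
        for child in children.findall('att'):
            yield parent, child.findtext('attval')


def xml_to_csv(xml_path, csv_path):
    """Write one 'parent,child' line per nested att element."""
    root = ElementTree.parse(xml_path).getroot()
    # newline='' lets the csv module control line endings itself.
    with open(csv_path, 'w', newline='') as handle:
        csv.writer(handle).writerows(parent_child_rows(root))
```

Calling xml_to_csv('input.xml', 'output.csv') should then write the four lines shown above to output.csv. Using csv.writer instead of string formatting also means values that contain commas or quotes get escaped properly.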