I have a XML file:
<sentence id="en_BlueRibbonSushi_478218345:2">
<text>It has great sushi and even better service.</text>
</sentence>
<sentence id="en_BlueRibbonSushi_478218345:3">
<text>The entire staff was extremely accomodating and tended to my every need.</text>
</sentence>
<sentence id="en_BlueRibbonSushi_478218345:4">
<text>I've been to this restaurant over a dozen times with no complaints to date.</text>
</sentence>
Using XML ElementTree, I would like to insert a tag <Opinion>
that has an attribute category=
. Say I have a list of chars list = ['a', 'b', 'c']
, is it possible to incrementally asign them to each text so I have:
<sentence id="en_BlueRibbonSushi_478218345:2">
<text>It has great sushi and even better service.</text>
<Opinion category='a' />
</sentence>
<sentence id="en_BlueRibbonSushi_478218345:3">
<text>The entire staff was extremely accomodating and tended to my every need.</text>
<Opinion category='b' />
</sentence>
<sentence id="en_BlueRibbonSushi_478218345:4">
<text>I've been to this restaurant over a dozen times with no complaints to date.</text>
<Opinion category='c' />
</sentence>
I am aware I can use the sentence id attribute but this would require a lot of restructuring of my code. Basically, I'd like to be able to index each sentence entry to align with my list index.
The xml. etree. ElementTree module implements a simple and efficient API for parsing and creating XML data. Changed in version 3.3: This module will use a fast implementation whenever available.
Example Read XML File in Python To read an XML file, firstly, we import the ElementTree class found inside the XML library. Then, we will pass the filename of the XML file to the ElementTree. parse() method, to start parsing. Then, we will get the parent tag of the XML file using getroot() .
This type of tree structure is applicable to XML files as well. Therefore, the BeautifulSoup class can also be used to parse XML files directly. The installation of BeautifulSoup has already been discussed at the end of the lesson on Setting up for Python programming.
You can use the SubElement
factory function to add elements to the tree.
Assuming your XML data is in a variable called data
, this will add the elements to your document tree:
import xml.etree.ElementTree as ET
tree = ET.XML(data)
for elem, category in zip(tree.findall('sentence'), ['a', 'b', 'c']):
Opinion = ET.SubElement(elem, 'Opinion')
Opinion.set('category', category)
ET.dump(tree) # prints the tree; tree.write('output.xml') is another option
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With