XML ElementTree - indexing tags

I have an XML file:

<sentence id="en_BlueRibbonSushi_478218345:2">
   <text>It has great sushi and even better service.</text>
</sentence>
<sentence id="en_BlueRibbonSushi_478218345:3">
   <text>The entire staff was extremely accomodating and tended to my every need.</text>
</sentence>
<sentence id="en_BlueRibbonSushi_478218345:4">
   <text>I&apos;ve been to this restaurant over a dozen times with no complaints to date.</text>
</sentence>

Using XML ElementTree, I would like to insert an <Opinion> tag with a category attribute into each sentence. Say I have a list of characters, list = ['a', 'b', 'c']. Is it possible to assign them incrementally to each sentence, so that I have:

<sentence id="en_BlueRibbonSushi_478218345:2">
   <text>It has great sushi and even better service.</text>
   <Opinion category='a' />
</sentence>
<sentence id="en_BlueRibbonSushi_478218345:3">
   <text>The entire staff was extremely accomodating and tended to my every need.</text>
   <Opinion category='b' />
</sentence>
<sentence id="en_BlueRibbonSushi_478218345:4">
   <text>I&apos;ve been to this restaurant over a dozen times with no complaints to date.</text>
   <Opinion category='c' />
</sentence>

I am aware I can use the sentence id attribute, but this would require a lot of restructuring of my code. Basically, I'd like to be able to index each sentence entry so that it aligns with my list index.

user3058703 asked Mar 29 '17 14:03

1 Answer

You can use the SubElement factory function to add elements to the tree. Assuming your XML data is in a variable called data and the sentence elements are wrapped in a single root element (a document needs one to parse), this will add the elements to your document tree:

import xml.etree.ElementTree as ET

root = ET.XML(data)  # ET.XML returns the root Element, not an ElementTree
# Pair each <sentence> with a category, in document order
for elem, category in zip(root.findall('sentence'), ['a', 'b', 'c']):
    opinion = ET.SubElement(elem, 'Opinion')
    opinion.set('category', category)

ET.dump(root)  # prints the tree; ET.ElementTree(root).write('output.xml') is another option
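
Since the question asks about indexing sentences so they line up with the list index, here is a minimal self-contained sketch of that variant. It assumes the sentences are wrapped in a root element, here given the hypothetical name <sentences> (the question does not show the real one), and it uses enumerate so that each sentence's position is available as an explicit index. The names categories, sentences and result are illustrative only.

```python
import xml.etree.ElementTree as ET

# Sample data from the question, wrapped in a hypothetical <sentences> root
# so that it parses as a single well-formed document.
data = """<sentences>
<sentence id="en_BlueRibbonSushi_478218345:2">
   <text>It has great sushi and even better service.</text>
</sentence>
<sentence id="en_BlueRibbonSushi_478218345:3">
   <text>The entire staff was extremely accomodating and tended to my every need.</text>
</sentence>
<sentence id="en_BlueRibbonSushi_478218345:4">
   <text>I&apos;ve been to this restaurant over a dozen times with no complaints to date.</text>
</sentence>
</sentences>"""

categories = ['a', 'b', 'c']

root = ET.fromstring(data)
# findall returns the matching direct children of root in document order,
# so sentences[i] lines up with categories[i].
sentences = root.findall('sentence')

for i, sentence in enumerate(sentences):
    # Keyword arguments to SubElement become attributes of the new element.
    ET.SubElement(sentence, 'Opinion', category=categories[i])

result = ET.tostring(root, encoding='unicode')
print(result)
```

Note that indexing with categories[i] raises an IndexError if there are more sentences than categories, whereas zip silently stops at the shorter of the two sequences; which behaviour is preferable depends on whether a length mismatch should be treated as an error.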
cco answered Nov 07 '22 23:11
