Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Reading XML file and fetching its attributes value in Python

I have this XML file:

<domain type='kmc' id='007'>
  <name>virtual bug</name>
  <uuid>66523dfdf555dfd</uuid>
  <os>
    <type arch='xintel' machine='ubuntu'>hvm</type>
    <boot dev='hd'/>
    <boot dev='cdrom'/>
  </os>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>270336</currentMemory>
  <vcpu placement='static'>10</vcpu>

Now, I want parse this and fetch its attribute value. For instance, I want to fetch the uuid field. So what should be the proper method to fetch it, in Python?

like image 333
S.Ali Avatar asked Sep 05 '12 21:09

S.Ali


People also ask

How do I read XML content in Python?

To read an XML file using ElementTree, firstly, we import the ElementTree class found inside xml library, under the name ET (common convension). Then passed the filename of the xml file to the ElementTree. parse() method, to enable parsing of our xml file. Then got the root (parent tag) of our xml file using getroot().

What is fetch data from XML file?

The page uses the XMLHttpRequest (JavaScript) object to fetch the XML file (sample. xml) then parses it in JavaScript and creates the chart. The function that parses the XML response and then uses the data to create the chart is shown below and called myXMLProcessor() (it's the XMLHttpRequest callback function).

What is attribute value XML?

An attribute element is used without any quotation and the attribute value is used in a single (' ') or double quotation (” “). An attribute name and its value should always appear in pair. An attribute value should not contain direct or indirect entity references to external entities.


1 Answers

Here's an lxml snippet that extracts an attribute as well as element text (your question was a little ambiguous about which one you needed, so I'm including both):

from lxml import etree
doc = etree.parse(filename)

memoryElem = doc.find('memory')
print memoryElem.text        # element text
print memoryElem.get('unit') # attribute

You asked (in a comment on Ali Afshar's answer) whether minidom (2.x, 3.x) is a good alternative. Here's the equivalent code using minidom; judge for yourself which is nicer:

import xml.dom.minidom as minidom
doc = minidom.parse(filename)

memoryElem = doc.getElementsByTagName('memory')[0]
print ''.join( [node.data for node in memoryElem.childNodes] )
print memoryElem.getAttribute('unit')

lxml seems like the winner to me.

like image 165
ron rothman Avatar answered Oct 29 '22 00:10

ron rothman