
Reading XML using Python minidom and iterating over each node

I have an XML structure that looks like the following, but on a much larger scale:

<root>
    <conference name='1'>
        <author>
            Bob
        </author>
        <author>
            Nigel
        </author>
    </conference>
    <conference name='2'>
        <author>
            Alice
        </author>
        <author>
            Mary
        </author>
    </conference>
</root>

For this, I used the following code:

from xml.dom.minidom import parse

dom = parse(filepath)
conference = dom.getElementsByTagName('conference')
for node in conference:
    conf_name = node.getAttribute('name')
    print(conf_name)
    alist = node.getElementsByTagName('author')
    for a in alist:
        authortext = a.nodeValue
        print(authortext)

However, the authortext that gets printed is None. I tried variations like the one below, but they cause my program to break.

authortext=a[0].nodeValue 

The correct output should be:

1
Bob
Nigel
2
Alice
Mary

But what I get is:

1
None
None
2
None
None

Any suggestions on how to tackle this problem?

GobiasKoffi asked Sep 11 '09 16:09



1 Answer

Your node a is of type 1 (ELEMENT_NODE), and the nodeValue of an element node is always None. To get a string you normally need a TEXT_NODE, which here is the first child of the author element. This will work:

a.childNodes[0].nodeValue 
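
For context, here is a minimal sketch of how that expression might fit into the loop from the question. It assumes Python 3, inlines the sample XML with parseString instead of reading filepath, and wraps the loop in a helper named conference_authors, which is a hypothetical name chosen for illustration. The strip() call is an addition as well: the text node in the sample contains the newlines and indentation around each name, so without it the printed values would carry that whitespace.

```python
from xml.dom.minidom import parseString

# Sample document from the question, inlined so the sketch is self-contained.
XML = """<root>
    <conference name='1'>
        <author>
            Bob
        </author>
        <author>
            Nigel
        </author>
    </conference>
    <conference name='2'>
        <author>
            Alice
        </author>
        <author>
            Mary
        </author>
    </conference>
</root>"""


def conference_authors(xml_text):
    """Return a list of (conference name, [author names]) pairs."""
    dom = parseString(xml_text)
    result = []
    for conf in dom.getElementsByTagName('conference'):
        authors = []
        for a in conf.getElementsByTagName('author'):
            # a is an ELEMENT_NODE; its text lives in its first child,
            # which is a TEXT_NODE. strip() drops surrounding whitespace.
            authors.append(a.childNodes[0].nodeValue.strip())
        result.append((conf.getAttribute('name'), authors))
    return result


for name, authors in conference_authors(XML):
    print(name)
    for author in authors:
        print(author)
```

With the sample document this prints 1, Bob, Nigel, 2, Alice, Mary, one per line. Note that childNodes[0] assumes every author element has at least one child; an empty element such as <author></author> would raise an IndexError, so real data may need a check first.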
SilentGhost answered Oct 01 '22 10:10