I'm parsing an XML file with Python (xml.dom.minidom) and I can't get the tagName of a node.
The interpreter is returning:
AttributeError: Text instance has no attribute 'tagName'
when I try to extract (for example) the string 'format' from the node:
<format>DVD</format>
I have found a couple of very similar posts here on Stack Overflow, but I still can't find the solution.
I'm aware that there might be alternative modules to deal with this issue, but my intention here is to understand why it is failing.
Thanks a lot in advance and best regards,
Here is my code:
from xml.dom.minidom import parse
import xml.dom.minidom

# Open XML document
xml = xml.dom.minidom.parse("movies.xml")

# collection Node
collection_node = xml.firstChild

# movie Nodes
movie_nodes = collection_node.childNodes

for m in movie_nodes:
    if len(m.childNodes) > 0:
        print '\nMovie:', m.getAttribute('title')
        for tag in m.childNodes:
            print tag.tagName  # AttributeError: Text instance has no attribute 'tagName'
            for text in tag.childNodes:
                print text.data
And here is the XML:
<collection shelf="New Arrivals">
  <movie title="Enemy Behind">
    <type>War, Thriller</type>
    <format>DVD</format>
    <year>2003</year>
    <rating>PG</rating>
    <stars>10</stars>
    <description>Talk about a US-Japan war</description>
  </movie>
  <movie title="Transformers">
    <type>Anime, Science Fiction</type>
    <format>DVD</format>
    <year>1989</year>
    <rating>R</rating>
    <stars>8</stars>
    <description>A schientific fiction</description>
  </movie>
</collection>
Similar posts:
Get node name with minidom
Element.tagName for python not working
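The error occurs because the newlines (and any indentation) between element nodes are parsed as separate nodes of type TEXT_NODE (see Node.nodeType), and a Text node doesn't have a tagName attribute. So m.childNodes contains a whitespace-only Text node before and after each element such as <format>, and the loop fails on the first one.
You can add a node type check to avoid reading tagName from text nodes: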
    if tag.nodeType != tag.TEXT_NODE:
        print tag.tagName
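To see the whitespace nodes for yourself, here is a minimal, self-contained sketch. It assumes Python 3 (the question's code is Python 2), uses a shortened inline copy of the XML instead of movies.xml, and introduces a helper named element_children that is purely illustrative and not part of minidom. It filters on ELEMENT_NODE rather than excluding TEXT_NODE, which also skips comments and other non-element nodes.

```python
from xml.dom.minidom import parseString

# Shortened copy of the question's XML. The newlines and indentation
# between the tags are what minidom turns into Text nodes.
XML = """<collection shelf="New Arrivals">
  <movie title="Enemy Behind">
    <type>War, Thriller</type>
    <format>DVD</format>
  </movie>
</collection>"""


def element_children(node):
    """Return only the children that are elements (hypothetical helper)."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]


doc = parseString(XML)
collection = doc.documentElement
movie = element_children(collection)[0]

# Every child of <movie>, including the whitespace-only Text nodes,
# whose nodeName is '#text'.
print([c.nodeName for c in movie.childNodes])

# Only the element children, which do have a tagName.
for tag in element_children(movie):
    print(tag.tagName + ": " + tag.firstChild.data)
```

The first print shows '#text' entries alternating with 'type' and 'format', which is exactly where the original loop hit the AttributeError. If you only need the name, nodeName is defined on every node type, so it can be read without a type check.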