I am trying to use Python's xml.etree.ElementTree.parse() function to parse an XML file I created by exporting all of the content from a WordPress blog. However, when I run the following:
import xml.etree.ElementTree as xml
tree = xml.parse('/path/to/file.xml')
I get the following error:
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1183, in parse
tree.parse(source, parser)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
parser.feed(data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1643, in feed
self._raiseerror(v)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1507, in _raiseerror
raise err
ParseError: unbound prefix: line 189, column 1
Here's what's on line 189 of my XML file:
<atom:link rel="search" type="application/opensearchdescription+xml" href="http://blogname.wordpress.com/osd.xml" title="blog name" />
I've seen many questions about this error coming up with Android development, but I can't tell if and how that applies to my situation. Can anyone help with this?
Apologies to everyone for whom this was obvious, but it turns out the document simply had no namespace declaration for the "atom" prefix. That is what "unbound prefix" means: the prefix "atom" is used in an element name but is never bound to a namespace URI by an xmlns:atom="..." attribute on that element or one of its ancestors.
Adding that declaration solved the problem. It does make me wonder why WordPress exports XML files without declarations for all of the namespaces they use...
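For anyone who wants to see the error and the fix side by side, here is a minimal sketch. The two XML strings are made-up stand-ins for the export file, and I am assuming the standard Atom namespace URI (http://www.w3.org/2005/Atom) is the one the export intended; try_parse is just a helper name for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the standard Atom namespace URI is the intended binding.
ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical, cut-down stand-in for the export: uses atom: without declaring it.
BROKEN = (
    '<rss version="2.0"><channel>'
    '<atom:link rel="search" href="http://blogname.wordpress.com/osd.xml" />'
    '</channel></rss>'
)

# Same document with the prefix bound on the root element.
FIXED = BROKEN.replace(
    '<rss version="2.0">',
    '<rss version="2.0" xmlns:atom="%s">' % ATOM_NS,
)


def try_parse(text):
    """Return the root element, or the error message if parsing fails."""
    try:
        return ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)


broken_result = try_parse(BROKEN)  # an "unbound prefix: ..." message
fixed_root = try_parse(FIXED)      # an Element

# Once bound, the element is addressed by its namespace URI, not its prefix.
link = fixed_root.find("channel/{%s}link" % ATOM_NS)
print(broken_result)
print(link.get("rel"))
```

Note that after parsing, ElementTree stores the tag as {namespace-uri}link, so lookups need the URI (or a prefix-to-URI mapping passed to find) rather than the bare atom: prefix.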
Just in case it helps someone some day, I was also working with a WordPress XML export (WordPress eXtended RSS) file in Python and was getting the same error. In my case, WordPress had included most of the correct namespace definitions. However, the XML had iTunes podcast information as well, and the iTunes namespace declaration was not present.
I fixed it by adding xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" to the attributes of the opening <rss> tag. So this:
<!-- generator="WordPress/4.9.8" created="2018-08-06 03:12" -->
<rss version="2.0"
xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:wp="http://wordpress.org/export/1.2/"
>
became this:
<!-- generator="WordPress/4.9.8" created="2018-08-06 03:12" -->
<rss version="2.0"
xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:wp="http://wordpress.org/export/1.2/"
xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
>
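If you would rather not edit the export by hand, the same fix can be applied in memory before parsing. This is only a rough sketch under a few assumptions: the root element is <rss>, the file is small enough to read as one string, and declare_missing_prefixes and the sample text are hypothetical names and data made up for this example. It uses a plain string check for existing declarations, so treat it as a workaround rather than a robust XML tool.

```python
import re
import xml.etree.ElementTree as ET

ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"


def declare_missing_prefixes(xml_text, prefixes):
    """Add xmlns:prefix="uri" to the opening <rss tag for each prefix
    that has no xmlns declaration anywhere in the text (hypothetical helper)."""
    missing = "".join(
        ' xmlns:%s="%s"' % (prefix, uri)
        for prefix, uri in prefixes.items()
        if ("xmlns:%s=" % prefix) not in xml_text
    )
    # Insert the new attributes right after the first "<rss".
    return re.sub(r"<rss\b", lambda m: m.group(0) + missing, xml_text, count=1)


# Made-up sample shaped like the export header above, with an itunes: element.
sample = (
    '<rss version="2.0"\n'
    ' xmlns:wp="http://wordpress.org/export/1.2/"\n'
    '>\n'
    '<channel><itunes:author>someone</itunes:author></channel>\n'
    '</rss>'
)

patched = declare_missing_prefixes(
    sample,
    {"itunes": ITUNES_NS, "wp": "http://wordpress.org/export/1.2/"},
)
root = ET.fromstring(patched)
author = root.find("channel/{%s}author" % ITUNES_NS)
print(author.text)
```

With a real export you would read the file contents into a string, pass them through the helper, and hand the result to ET.fromstring instead of calling parse() on the path.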