I am reading an XML file using Python, but the file contains & characters, so running my Python code gives the following error:
xml.parsers.expat.ExpatError: not well-formed (invalid token):
Is there a way to make Python ignore the & check?
The most common cause is encoding errors. There are several basic approaches to solving this: escaping problematic characters (< becomes &lt;, & becomes &amp;, etc.), escaping entire blocks of text with CDATA sections, or putting an encoding declaration at the start of the feed.
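The first two approaches can be sketched in Python as follows. This is a minimal illustration rather than a definitive recipe: the sample string and the `<item>` element are made up for the example, and `xml.sax.saxutils.escape` is just one standard-library helper that performs the escaping.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical text containing characters that are special in XML.
raw = "Fish & Chips <large>"

# Approach 1: escape the problematic characters (& -> &amp;, < -> &lt;, > -> &gt;).
escaped = escape(raw)
doc_escaped = "<item>" + escaped + "</item>"

# Approach 2: wrap the text in a CDATA section.
# This only works if the text itself does not contain "]]>".
doc_cdata = "<item><![CDATA[" + raw + "]]></item>"

# Both documents are well-formed and parse back to the original text.
text_from_escaped = ET.fromstring(doc_escaped).text
text_from_cdata = ET.fromstring(doc_cdata).text
```

Either way, the parser hands back the original string, so the escaping is invisible to code that consumes the parsed document.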
"XML Parsing Error" occurs when something is trying to read the XML, not when it is being generated. Also, "not well-formed" usually refers to errors in the structure of the document, such as a missing end-tag, not the characters it contains.
To read an XML file using ElementTree, first import the ElementTree module from the xml library under the name ET (a common convention). Then pass the filename of the XML file to the ElementTree.parse() method to parse the file, and get the root (parent tag) of the document using getroot().
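Those steps might look like the sketch below. The file contents, the `library` and `book` tag names, and the temporary file are assumptions made only so the example is self-contained; in practice you would pass the path of your own file to `ET.parse()`.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical document, written to a temporary file so the example can run anywhere.
xml_text = "<library><book>Dune</book><book>Emma</book></library>"

with tempfile.NamedTemporaryFile(
    "w", suffix=".xml", delete=False, encoding="utf-8"
) as f:
    f.write(xml_text)
    path = f.name

try:
    tree = ET.parse(path)      # parse the file into an ElementTree
    root = tree.getroot()      # the root (parent) element
    titles = [book.text for book in root.findall("book")]
finally:
    os.remove(path)            # clean up the temporary file
```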
No, you can't ignore the check. Your 'xml file' is not an XML file - to be an XML file, the ampersand would have to be escaped. Therefore, no software that is designed to read XML files will parse it without error. You need to correct the software that generated this file so that it generates proper ("well-formed") XML. All the benefits of using XML for interchange disappear entirely if people start sending stuff that isn't well-formed and people receiving it try to patch it up.
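If the generating software happens to be written in Python, one way to get well-formed output is to build the document with an XML library instead of concatenating strings. The sketch below assumes made-up element names (`company`, `name`) and sample text; it is only an illustration of the idea, not the fix for any particular generator.

```python
import xml.etree.ElementTree as ET

# Build the document through the library; it escapes special characters on output.
root = ET.Element("company")
name = ET.SubElement(root, "name")
name.text = "Johnson & Johnson"   # plain text, no manual escaping

xml_bytes = ET.tostring(root, encoding="utf-8")

# The serialized form contains &amp; and parses back to the original text.
round_trip = ET.fromstring(xml_bytes).find("name").text

# By contrast, a hand-built string with a bare ampersand is rejected by the parser.
try:
    ET.fromstring("<name>Johnson & Johnson</name>")
    hand_built_ok = True
except ET.ParseError:
    hand_built_ok = False
```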
For me, adding the line "<?xml version='1.0' encoding='iso-8859-1'?>" at the front of the string did the trick.
>>> import xml.etree.ElementTree as ET  # standard-library home of ElementTree
>>> text = b'''<?xml version="1.0" encoding="iso-8859-1"?>
... <seuss><fish>red</fish><fish>blu\xe9</fish></seuss>'''
>>> doc = ET.fromstring(text)
See this page: https://mail.python.org/pipermail/tutor/2006-November/050757.html