Get better parse error message from ElementTree

If I try to parse broken XML, the exception shows only the line and column number. Is there a way to show the XML context as well?

I want to see the XML tags before and after the broken part.

Example:

import xml.etree.ElementTree as ET
tree = ET.fromstring('<a><b></a>')

Exception:

Traceback (most recent call last):
  File "tmp/foo.py", line 2, in <module>
    tree = ET.fromstring('<a><b></a>')
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: mismatched tag: line 1, column 8

Something like this would be nice:

ParseError:
<a><b></a>
=====^
guettli asked Jan 05 '15 12:01


1 Answer

You could make a helper function to do this:

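import sys
import io
import itertools as IT
import xml.etree.ElementTree as ET

PY2 = sys.version_info[0] == 2
# A Python 2 str is bytes and needs BytesIO; a Python 3 str needs StringIO.
StringIO = io.BytesIO if PY2 else io.StringIO

def myfromstring(content):
    try:
        tree = ET.fromstring(content)
    except ET.ParseError as err:
        # err.position is (line, column); the line number is 1-based.
        lineno, column = err.position
        # Skip the first lineno - 1 lines so that next() returns the offending line.
        lines = IT.islice(StringIO(content), lineno - 1, None)
        line = next(lines, '').rstrip('\r\n')
        caret = '{:=>{}}'.format('^', column)
        err.msg = '{}\n{}\n{}'.format(err, line, caret)
        raise
    return tree

myfromstring('<a><b></a>')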

This raises:

xml.etree.ElementTree.ParseError: mismatched tag: line 1, column 8
<a><b></a>
=======^
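
Since the question also asks for the tags before and after the broken part, here is a rough sketch of the same idea extended to multi-line documents. The name parse_with_context and its context parameter are made up for this illustration (they are not part of ElementTree), and the sketch assumes content is a text string rather than bytes. It prints a few numbered lines around the reported position and puts the caret under the offending line:

```python
import xml.etree.ElementTree as ET


def parse_with_context(content, context=2):
    """Parse an XML string; on failure, re-raise with nearby lines in the message.

    Hypothetical helper for illustration; `context` is the number of
    lines shown before and after the line reported by the parser.
    """
    try:
        return ET.fromstring(content)
    except ET.ParseError as err:
        lineno, column = err.position  # lineno is 1-based
        lines = content.splitlines()
        first = max(lineno - 1 - context, 0)
        last = min(lineno + context, len(lines))
        shown = []
        for index in range(first, last):
            # Each line gets a 6-character prefix such as "   4: ".
            shown.append('{:>4}: {}'.format(index + 1, lines[index]))
            if index == lineno - 1:
                # Same caret as above, shifted right by the prefix width.
                shown.append(' ' * 6 + '{:=>{}}'.format('^', column))
        err.msg = '{}\n{}'.format(err, '\n'.join(shown))
        raise
```

Calling parse_with_context('<root>\n  <a>\n    <b>\n  </a>\n</root>', context=1) should raise a ParseError whose message lists lines 3 to 5 with the caret under line 4. If the parser reports a position past the last line (for example, an unclosed document ending in a newline), the sketch simply shows the trailing lines without a caret.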
unutbu answered Oct 04 '22 22:10