How to get details from PyYAML exception?

I want to gracefully tell the user exactly where their mucked-up YAML file is flawed. Line 288 of python-3.4.1/lib/python-3.4/yaml/scanner.py reports a common parsing error by raising an exception:

raise ScannerError("while scanning a simple key", key.mark,
                   "could not found expected ':'", self.get_mark())

I am struggling with how to report it.

try:
    parsed_yaml = yaml.safe_load(txt)

except yaml.YAMLError as exc:
    print ("scanner error 1")
    if hasattr(exc, 'problem_mark'):
        mark = exc.problem_mark
        print("Error parsing Yaml file at line %s, column %s." %
                                            (mark.line, mark.column+1))
    else:
        print ("Something went wrong while parsing yaml file")
    return

This gives

$ yaml_parse.py
scanner error 1
Error parsing Yaml file at line 1508, column 9.

But how do I get the error text and whatever is in key.mark and the other mark?

More usefully, how do I examine the PyYAML source to figure this out? The ScannerError class seems to ignore the parameters (from scanner.py line 32):

class ScannerError(MarkedYAMLError):
    pass

wallyk asked Jan 09 '23 09:01


2 Answers

Based on @Anthon's answer, this code works quite well:

import sys

try:
    import yaml
except ImportError:
    print('Fatal error:  Yaml library not available')
    sys.exit(1)

# Read the whole file into a string so the error marks can show a snippet.
with open('y.yml') as f:
    txt = f.read()

try:
    yml = yaml.load(txt, yaml.SafeLoader)

except yaml.YAMLError as exc:
    print("Error while parsing YAML file:")
    if hasattr(exc, 'problem_mark'):
        if exc.context is not None:
            print('  parser says\n' + str(exc.problem_mark) + '\n  ' +
                  str(exc.problem) + ' ' + str(exc.context) +
                  '\nPlease correct data and retry.')
        else:
            print('  parser says\n' + str(exc.problem_mark) + '\n  ' +
                  str(exc.problem) + '\nPlease correct data and retry.')
    else:
        print("Something went wrong while parsing yaml file")
    # A bare `return` is only valid inside a function; exit the script instead.
    sys.exit(1)

# make use of `yml`

Example outputs with mildly clobbered data:

$ yaml_parse.py
Error while parsing YAML file:
  parser says
  in "<unicode string>", line 1525, column 9:
      - name: Curve 1
            ^
  could not found expected ':' while scanning a simple key
Please correct data and retry.

$ yaml_parse.py
Error while parsing YAML file:
  parser says
  in "<unicode string>", line 1526, column 10:
        curve: title 1
             ^
  mapping values are not allowed here
Please correct data and retry.

wallyk answered Jan 17 '23 01:01


The ScannerError class has no methods defined (the pass statement works like a no-op). That makes it functionally identical to its base class MarkedYAMLError, which is the class that stores the data. From error.py:

class MarkedYAMLError(YAMLError):
    def __init__(self, context=None, context_mark=None,
                 problem=None, problem_mark=None, note=None):
        self.context = context
        self.context_mark = context_mark
        self.problem = problem
        self.problem_mark = problem_mark
        self.note = note

    def __str__(self):
        lines = []
        if self.context is not None:
            lines.append(self.context)
        if self.context_mark is not None  \
           and (self.problem is None or self.problem_mark is None
                or self.context_mark.name != self.problem_mark.name
                or self.context_mark.line != self.problem_mark.line
                or self.context_mark.column != self.problem_mark.column):
            lines.append(str(self.context_mark))
        if self.problem is not None:
            lines.append(self.problem)
        if self.problem_mark is not None:
            lines.append(str(self.problem_mark))
        if self.note is not None:
            lines.append(self.note)
        return '\n'.join(lines)
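
A minimal sketch of reading those attributes in an except block, assuming PyYAML is installed. The helper names describe_yaml_error and try_load are made up for illustration and are not part of PyYAML:

```python
import yaml


def describe_yaml_error(exc):
    """Collect the fields a MarkedYAMLError carries into a plain dict.

    Hypothetical helper for illustration; it is not part of PyYAML.
    """
    details = {"context": None, "problem": None, "note": None,
               "context_pos": None, "problem_pos": None}
    # Plain YAMLError instances lack these attributes, hence getattr().
    for field in ("context", "problem", "note"):
        details[field] = getattr(exc, field, None)
    for field in ("context", "problem"):
        mark = getattr(exc, field + "_mark", None)
        if mark is not None:
            # Mark.line and Mark.column are zero-based; add 1 for display.
            details[field + "_pos"] = (mark.line + 1, mark.column + 1)
    return details


def try_load(text):
    """Return (data, None) on success or (None, details) on a YAML error."""
    try:
        return yaml.safe_load(text), None
    except yaml.YAMLError as exc:
        return None, describe_yaml_error(exc)


if __name__ == "__main__":
    data, err = try_load("hallo: 1\nbye\n")
    print(data)
    print(err)
```

Note that str(mark) already prints one-based positions, while the raw line and column attributes are zero-based, which is why the sketch adds 1.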

If you start with a file txt.yaml:

hallo: 1
bye

and a test.py:

import ruamel.yaml as yaml
txt = open('txt.yaml')
data = yaml.load(txt, yaml.SafeLoader)

you will get the not so descriptive error:

...
ruamel.yaml.scanner.ScannerError: while scanning a simple key
  in "txt.yaml", line 2, column 1
could not find expected ':'
  in "txt.yaml", line 3, column 1

However, if you change the second line of test.py:

import ruamel.yaml as yaml
txt = open('txt.yaml').read()
data = yaml.load(txt, yaml.SafeLoader)

you get the more interesting error description:

...
ruamel.yaml.scanner.ScannerError: while scanning a simple key
  in "<byte string>", line 2, column 1:
    bye
    ^
could not find expected ':'
  in "<byte string>", line 3, column 1:

    ^

This difference is because get_mark() (in reader.py) has more context to point to if it is not handling a stream:

def get_mark(self):
    if self.stream is None:
        return Mark(self.name, self.index, self.line, self.column,
                    self.buffer, self.pointer)
    else:
        return Mark(self.name, self.index, self.line, self.column,
                    None, None)

This data goes into the context_mark attribute; look at it when you want to provide more context for the error. As shown above, though, that only works if you parse the YAML input from a buffer, not from a stream.
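
The buffer-versus-stream difference can be checked directly with Mark.get_snippet(), which returns None when the mark has no buffer. This is only a sketch, assuming PyYAML's pure-Python SafeLoader; the snippets function and the BROKEN constant are illustrative names:

```python
import io

import yaml

# Same content as txt.yaml above: the second line is missing its ':'.
BROKEN = "hallo: 1\nbye\n"


def snippets(source):
    """Parse `source` (a str or a stream) and return the snippets of the
    context and problem marks, or None if parsing succeeds.

    Illustrative helper; assumes the error carries both marks, which is
    the case for the scanner error triggered by BROKEN.
    """
    try:
        yaml.load(source, Loader=yaml.SafeLoader)
    except yaml.MarkedYAMLError as exc:
        return (exc.context_mark.get_snippet(),
                exc.problem_mark.get_snippet())
    return None


if __name__ == "__main__":
    # Parsed from a string: the marks keep the buffer, so snippets exist.
    print(snippets(BROKEN))
    # Parsed from a stream: the marks are created with buffer=None.
    print(snippets(io.StringIO(BROKEN)))
```

In practice this means reading the file into a string first, as in the second test.py, if the message should quote the offending line.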

Searching the YAML source is a hard task: the methods of the various classes all end up on either the Loader or the Dumper, which inherit from those classes. The best way to trace a call is to grep for def method_name(, since the method names are all distinct (as they have to be for this to work).
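
As an alternative to grep, the same lookup can be sketched from Python by walking the loader's method resolution order. The find_definition helper is a made-up name for illustration, not a PyYAML function:

```python
import inspect

import yaml


def find_definition(cls, method_name):
    """Return (defining class, source file) for `method_name` on `cls`.

    Walks the MRO and stops at the first class whose own namespace
    contains the name. Returns (None, None) if no class defines it.
    """
    for base in cls.__mro__:
        if method_name in vars(base):
            return base, inspect.getsourcefile(base)
    return None, None


if __name__ == "__main__":
    base, path = find_definition(yaml.SafeLoader, "get_mark")
    print(base, path)
```

Because the method names are distinct across the parent classes, the first hit in the MRO is the only definition.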


In the above I used my enhanced version of PyYAML, called ruamel.yaml. For the purpose of this answer, the two should work the same.

Anthon answered Jan 17 '23 01:01