I want to gracefully tell the user exactly where their mucked-up YAML file is flawed. Line 288 of python-3.4.1/lib/python-3.4/yaml/scanner.py reports a common parsing error by raising an exception:
raise ScannerError("while scanning a simple key", key.mark,
        "could not found expected ':'", self.get_mark())
I am struggling to work out how to report it.
try:
    parsed_yaml = yaml.safe_load(txt)
except yaml.YAMLError as exc:
    print ("scanner error 1")
    if hasattr(exc, 'problem_mark'):
        mark = exc.problem_mark
        # Mark.line and Mark.column are zero-based, so add 1 to both
        print("Error parsing Yaml file at line %s, column %s." %
              (mark.line+1, mark.column+1))
    else:
        print ("Something went wrong while parsing yaml file")
    return
This gives
$ yaml_parse.py
scanner error 1
Error parsing Yaml file at line 1508, column 9.
But how do I get the error text and whatever is in key.mark and the other mark?
More usefully, how do I examine the PyYAML source to figure this out? The ScannerError class seems to ignore its parameters (from scanner.py, line 32):
class ScannerError(MarkedYAMLError):
    pass
Based on @Anthon's answer, this code works quite well:
try:
    import yaml
except ImportError:
    print ('Fatal error: Yaml library not available')
    quit()
with open ('y.yml') as f:
    txt = f.read()
try:
    yml = yaml.load(txt, yaml.SafeLoader)
except yaml.YAMLError as exc:
    print ("Error while parsing YAML file:")
    if hasattr(exc, 'problem_mark'):
        if exc.context is not None:
            print (' parser says\n' + str(exc.problem_mark) + '\n ' +
                str(exc.problem) + ' ' + str(exc.context) +
                '\nPlease correct data and retry.')
        else:
            print (' parser says\n' + str(exc.problem_mark) + '\n ' +
                str(exc.problem) + '\nPlease correct data and retry.')
    else:
        print ("Something went wrong while parsing yaml file")
    quit()
# make use of `yml`
Example outputs with mildly clobbered data:
$ yaml_parse.py
Error while parsing YAML file:
 parser says
  in "<unicode string>", line 1525, column 9:
            - name: Curve 1
            ^
 could not found expected ':' while scanning a simple key
Please correct data and retry.
$ yaml_parse.py
Error while parsing YAML file:
 parser says
  in "<unicode string>", line 1526, column 10:
        curve: title 1
             ^
 mapping values are not allowed here
Please correct data and retry.
The ScannerError class has no methods defined (the pass statement works like a no-op). That makes it functionally identical to its base class MarkedYAMLError, which is the one that stores the data. From error.py:
class MarkedYAMLError(YAMLError):
    def __init__(self, context=None, context_mark=None,
                 problem=None, problem_mark=None, note=None):
        self.context = context
        self.context_mark = context_mark
        self.problem = problem
        self.problem_mark = problem_mark
        self.note = note
    def __str__(self):
        lines = []
        if self.context is not None:
            lines.append(self.context)
        if self.context_mark is not None  \
            and (self.problem is None or self.problem_mark is None
                    or self.context_mark.name != self.problem_mark.name
                    or self.context_mark.line != self.problem_mark.line
                    or self.context_mark.column != self.problem_mark.column):
            lines.append(str(self.context_mark))
        if self.problem is not None:
            lines.append(self.problem)
        if self.problem_mark is not None:
            lines.append(str(self.problem_mark))
        if self.note is not None:
            lines.append(self.note)
        return '\n'.join(lines)
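Because these five values are plain instance attributes, an except block can read them one by one instead of relying on __str__. Here is a minimal sketch of that idea. The helper name describe_yaml_error is made up for this answer, and the SimpleNamespace object only stands in for a real exception so the snippet runs without a YAML library installed:

```python
from types import SimpleNamespace


def describe_yaml_error(exc):
    """Collect whichever MarkedYAMLError-style attributes are set on exc.

    getattr() with a default means a plain YAMLError, which carries none
    of these attributes, gives an empty list instead of an AttributeError.
    """
    parts = []
    for label in ("context", "context_mark", "problem", "problem_mark", "note"):
        value = getattr(exc, label, None)
        if value is not None:
            parts.append("%s: %s" % (label, value))
    return parts


# Stand-in for a real exception (hypothetical values, illustration only).
fake_exc = SimpleNamespace(
    context="while scanning a simple key",
    context_mark=None,
    problem="could not find expected ':'",
    problem_mark=None,
    note=None,
)

for line in describe_yaml_error(fake_exc):
    print(line)
```

In real code you would call the helper from inside `except yaml.YAMLError as exc:`. The two mark attributes are then Mark objects, and converting them with str() gives the `in "...", line N, column M` text.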
If you start with a file txt.yaml:
hallo: 1
bye
and a test.py:
import ruamel.yaml as yaml
txt = open('txt.yaml')
data = yaml.load(txt, yaml.SafeLoader)
you will get the not so descriptive error:
...
ruamel.yaml.scanner.ScannerError: while scanning a simple key
  in "txt.yaml", line 2, column 1
could not find expected ':'
  in "txt.yaml", line 3, column 1
However, if you change the second line of test.py:
import ruamel.yaml as yaml
txt = open('txt.yaml').read()
data = yaml.load(txt, yaml.SafeLoader)
you get the more interesting error description:
...
ruamel.yaml.scanner.ScannerError: while scanning a simple key
  in "<byte string>", line 2, column 1:
    bye
    ^
could not find expected ':'
  in "<byte string>", line 3, column 1:
    ^
This difference is because get_mark() (in reader.py) has more context to point to if it is not handling a stream:
def get_mark(self):
    if self.stream is None:
        return Mark(self.name, self.index, self.line, self.column,
                self.buffer, self.pointer)
    else:
        return Mark(self.name, self.index, self.line, self.column,
                None, None)
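This data goes into the mark attributes: in the raise statement from scanner.py, key.mark becomes context_mark and the result of self.get_mark() becomes problem_mark. Look at those when you want to provide more context for the error. But as shown above, the source snippet is only available if you parse the YAML input from a buffer, not from a stream.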
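One way to make sure the parser always works from a buffer is to read the file completely before handing it over. A minimal sketch, assuming the file fits comfortably in memory: parse_from_buffer is a hypothetical helper name, and parse is whatever loading function you use (for example yaml.safe_load), passed in as a parameter so the snippet itself has no YAML dependency:

```python
def parse_from_buffer(path, parse, encoding="utf-8"):
    """Read the whole file into a string, then parse that string.

    Handing the parser a string rather than the open file object means
    it is working from a buffer and not from a stream.
    """
    with open(path, encoding=encoding) as handle:
        text = handle.read()
    return parse(text)
```

A call might look like `parse_from_buffer('txt.yaml', yaml.safe_load)`. The trade-off, visible in the output above, is that the mark then names the source as a generic string instead of txt.yaml, so you may want to print the file name yourself.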
Searching the YAML source is hard: the methods of the various classes all end up attached to either the Loader or the Dumper, which inherit from those classes. The best way to trace a method is to grep for def method_name(, since at least the method names are all distinct (as they have to be for this to work).
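If grep is not at hand, the same search can be scripted. The following is only a sketch of the idea, not part of PyYAML or ruamel.yaml: find_method_definitions is a made-up name, and it simply walks a directory looking for lines that start a definition of the given method:

```python
import os
import re


def find_method_definitions(source_dir, method_name):
    """Yield (path, line_number, line) for each 'def method_name(' found.

    Walks source_dir recursively and only reads .py files, which is
    roughly what grep -rn "def method_name(" source_dir would report.
    """
    pattern = re.compile(r"^\s*def\s+%s\s*\(" % re.escape(method_name))
    for folder, _subdirs, files in os.walk(source_dir):
        for filename in sorted(files):
            if not filename.endswith(".py"):
                continue
            path = os.path.join(folder, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if pattern.match(line):
                        yield path, number, line.rstrip()
```

Assuming the yaml package is importable, `list(find_method_definitions(os.path.dirname(yaml.__file__), 'get_mark'))` would be one way to point it at the installed source.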
In the above I used ruamel.yaml, my enhanced version of PyYAML; for the purpose of this answer the two should work the same.