I have
>>> import yaml
>>> yaml.dump(u'abc')
"!!python/unicode 'abc'\n"
But I want
>>> import yaml
>>> yaml.dump(u'abc', magic='something')
'abc\n'
What magic param forces no tagging?
YAML is a data serialization format designed for human readability and interaction with scripting languages. PyYAML is a YAML parser and emitter for Python. PyYAML features a complete YAML 1.1 parser, Unicode support, pickle support, capable extension API, and sensible error messages.
PyYAML is a YAML parser and emitter for Python. Using the PyYAML module, we can perform various actions such as reading and writing complex configuration YAML files, serializing and persisting YMAL data. Use it to convert the YAML file into a Python dictionary.
In this case, yaml. dump will write the produced YAML document into the file. Otherwise, yaml. dump returns the produced document.
Constructors and Representers From a high-level, a constructor allows you to take a YAML node and return a class instance; a representer allows you to serialize a class instance into a YAML node; and a tag helps PyYaml know which constructor or representer to call!
You can use safe_dump
instead of dump
. Just keep in mind that it won't be able to represent arbitrary Python objects then. Also, when you load
the YAML, you will get a str
object instead of unicode
.
How about this:
def unicode_representer(dumper, uni):
node = yaml.ScalarNode(tag=u'tag:yaml.org,2002:str', value=uni)
return node
yaml.add_representer(unicode, unicode_representer)
This seems to make dumping unicode objects work the same as dumping str objects for me (Python 2.6).
In [72]: yaml.dump(u'abc')
Out[72]: 'abc\n...\n'
In [73]: yaml.dump('abc')
Out[73]: 'abc\n...\n'
In [75]: yaml.dump(['abc'])
Out[75]: '[abc]\n'
In [76]: yaml.dump([u'abc'])
Out[76]: '[abc]\n'
You need a new dumper class that does everything the standard Dumper class does but overrides the representers for str and unicode.
from yaml.dumper import Dumper
from yaml.representer import SafeRepresenter
class KludgeDumper(Dumper):
pass
KludgeDumper.add_representer(str,
SafeRepresenter.represent_str)
KludgeDumper.add_representer(unicode,
SafeRepresenter.represent_unicode)
Which leads to
>>> print yaml.dump([u'abc',u'abc\xe7'],Dumper=KludgeDumper)
[abc, "abc\xE7"]
>>> print yaml.dump([u'abc',u'abc\xe7'],Dumper=KludgeDumper,encoding=None)
[abc, "abc\xE7"]
Granted, I'm still stumped on how to keep this pretty.
>>> print u'abc\xe7'
abcç
And it breaks a later yaml.load()
>>> yy=yaml.load(yaml.dump(['abc','abc\xe7'],Dumper=KludgeDumper,encoding=None))
>>> yy
['abc', 'abc\xe7']
>>> print yy[1]
abc�
>>> print u'abc\xe7'
abcç
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With