I am using pyyaml to dump an object to a file. There are several unicode strings in the object. I've done this before, but now it's producing output items like this:
'item': !!python/unicode "some string"
Instead of the desired:
'item': 'some string'
I'm intending to output as utf-8. The current command I use is:
yaml.dump(data,file(suite_out,'w'),encoding='utf-8',indent=4,allow_unicode=True)
In other locations I do the following and it works:
codecs.open(suite_out,"w","utf-8").write(
yaml.dump(suite,indent=4,width=10000)
)
What am I doing wrong?
Python 2.7.3
I tried many combinations and the only one I can find that consistently produces the correct YAML output is:
yaml.safe_dump(data, file(filename,'w'), encoding='utf-8', allow_unicode=True)
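For illustration, here is a minimal sketch of that call that should run on both Python 2 and Python 3. The sample data and the temporary output path are made up for this example. The file is opened in binary mode on the assumption that, with encoding='utf-8', PyYAML writes already-encoded bytes to the stream.

```python
import io
import os
import tempfile

import yaml

# Hypothetical sample data; the u"" prefix makes these unicode objects on Python 2.
data = {u"item": u"some string", u"name": u"caf\u00e9"}

# Hypothetical output path, used only for this illustration.
filename = os.path.join(tempfile.mkdtemp(), "suite.yaml")

# Binary mode: with encoding='utf-8' PyYAML emits encoded bytes to the stream.
with open(filename, "wb") as stream:
    yaml.safe_dump(data, stream, encoding="utf-8", allow_unicode=True)

# Read the file back as text to inspect what was written.
with io.open(filename, encoding="utf-8") as stream:
    text = stream.read()

print(text)
```

The written file should contain plain scalars with no !!python/unicode tags, and yaml.safe_load(text) should give back the original mapping. The exact layout (flow or block style) depends on the PyYAML version's default for default_flow_style.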
Inspired by the accepted answer, which showed that safe_dump produces the expected result, I checked the source of python2.7/site-packages/yaml/representer.py and found that the Representer classes behind dump and safe_dump use different represent functions for unicode.
The represent function can be overridden with add_representer, so you can take the represent function from SafeRepresenter and register it for use by dump.
I have to do this because I have some custom types, so I cannot use safe_dump.
The code is as follows:
import yaml
def represent_unicode(dumper, data):
    # Emit unicode objects as plain YAML strings instead of !!python/unicode
    return dumper.represent_scalar(u'tag:yaml.org,2002:str', data)
yaml.add_representer(unicode, represent_unicode)
My command to produce the output:
yaml.dump(yml, encoding='utf-8', allow_unicode=True, default_flow_style=False, explicit_start=True)
Python version is 2.7.5; PyYAML is 3.10.
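Putting the pieces together, this is a rough sketch of how the representer works alongside a custom type. The Point class and the sample data are hypothetical and only stand in for whatever custom types rule out safe_dump. type(u"") is used in place of unicode so the same snippet also runs on Python 3, where it resolves to str.

```python
import yaml


class Point(object):
    """Hypothetical custom type, standing in for types that safe_dump rejects."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def represent_unicode(dumper, data):
    # Tag text as a plain YAML string rather than !!python/unicode.
    return dumper.represent_scalar(u"tag:yaml.org,2002:str", data)


# unicode on Python 2, str on Python 3.
text_type = type(u"")
yaml.add_representer(text_type, represent_unicode)

# Hypothetical data mixing unicode strings with a custom object.
yml = {u"item": u"some string", u"origin": Point(0, 0)}

# With an explicit encoding, dump returns encoded bytes, so decode for display.
output = yaml.dump(yml, encoding="utf-8", allow_unicode=True,
                   default_flow_style=False, explicit_start=True)
text = output.decode("utf-8")
print(text)
```

The strings should come out as plain scalars, while the custom object still gets its !!python/object tag, which is the part safe_dump would refuse to emit.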