Any yaml libraries in Python that support dumping of long strings as block literals or folded blocks?

Tags:

python

yaml

pyyaml

I'd like to be able to dump a dictionary containing long strings that I'd like to have in the block style for readability. For example:

    foo: |
      this is a
      block literal
    bar: >
      this is a
      folded block

PyYAML supports the loading of documents with this style but I can't seem to find a way to dump documents this way. Am I missing something?


asked Jun 21 '11 22:06

guidoism

2 Answers

    import yaml

    # Python 2 code: unicode is the text type here
    class folded_unicode(unicode): pass
    class literal_unicode(unicode): pass

    def folded_unicode_representer(dumper, data):
        return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='>')
    def literal_unicode_representer(dumper, data):
        return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='|')

    yaml.add_representer(folded_unicode, folded_unicode_representer)
    yaml.add_representer(literal_unicode, literal_unicode_representer)

    data = {
        'literal':literal_unicode(
            u'by hjw              ___\n'
             '   __              /.-.\\\n'
             '  /  )_____________\\\\  Y\n'
             ' /_ /=== == === === =\\ _\\_\n'
             '( /)=== == === === == Y   \\\n'
             ' `-------------------(  o  )\n'
             '                      \\___/\n'),
        'folded': folded_unicode(
            u'It removes all ordinary curses from all equipped items. '
            'Heavy or permanent curses are unaffected.\n')}

    print yaml.dump(data)

The result:

    folded: >
      It removes all ordinary curses from all equipped items. Heavy or permanent curses
      are unaffected.
    literal: |
      by hjw              ___
         __              /.-.\
        /  )_____________\\  Y
       /_ /=== == === === =\ _\_
      ( /)=== == === === == Y   \
       `-------------------(  o  )
                            \___/

For completeness, one should also have str implementations, but I'm going to be lazy :-)
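On Python 3, where str is the only text type, the str counterpart mentioned above might look like the following sketch. The class names folded_str and literal_str are illustrative, not part of PyYAML, and the representers simply mirror the unicode ones above.

```python
import yaml


class folded_str(str):
    """Marker type (illustrative name): dump as a folded block ('>')."""


class literal_str(str):
    """Marker type (illustrative name): dump as a literal block ('|')."""


def folded_str_representer(dumper, data):
    # Same tag as a plain string, only the presentation style differs.
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='>')


def literal_str_representer(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')


# Registers on yaml.Dumper, the default dumper used by yaml.dump().
yaml.add_representer(folded_str, folded_str_representer)
yaml.add_representer(literal_str, literal_str_representer)

data = {
    'literal': literal_str('first line\nsecond line\n'),
    'folded': folded_str(
        'It removes all ordinary curses from all equipped items. '
        'Heavy or permanent curses are unaffected.\n'),
}

dumped = yaml.dump(data)
print(dumped)

# Loading the document back should give the original strings.
assert yaml.safe_load(dumped) == data
```

The marker classes carry no behaviour of their own; they only let the dumper pick a different representer for the strings you wrap.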


answered Sep 20 '22 10:09

Gary van der Merwe

pyyaml does support dumping literal or folded blocks.

Using `Representer.add_representer`

Define the types:

    class folded_str(str): pass

    class literal_str(str): pass

    class folded_unicode(unicode): pass

    class literal_unicode(unicode): pass

Then you can define the representers for those types. Please note that while Gary's solution works great for unicode, you may need some more work to get strings to work right (see implementation of represent_str).

    import yaml
    from yaml.representer import SafeRepresenter

    def change_style(style, representer):
        def new_representer(dumper, data):
            scalar = representer(dumper, data)
            scalar.style = style
            return scalar
        return new_representer

    # represent_str does handle some corner cases, so use that
    # instead of calling represent_scalar directly
    represent_folded_str = change_style('>', SafeRepresenter.represent_str)
    represent_literal_str = change_style('|', SafeRepresenter.represent_str)
    represent_folded_unicode = change_style('>', SafeRepresenter.represent_unicode)
    represent_literal_unicode = change_style('|', SafeRepresenter.represent_unicode)

Then you can add those representers to the default dumper:

    yaml.add_representer(folded_str, represent_folded_str)
    yaml.add_representer(literal_str, represent_literal_str)
    yaml.add_representer(folded_unicode, represent_folded_unicode)
    yaml.add_representer(literal_unicode, represent_literal_unicode)

... and test it:

    data = {
        'foo': literal_str('this is a\nblock literal'),
        'bar': folded_unicode('this is a folded block'),
    }

    print yaml.dump(data)

result:

    bar: >-
      this is a folded block
    foo: |-
      this is a
      block literal
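If you would rather not modify the default dumper, the same change_style idea can be attached to a dumper subclass instead. The following is a minimal Python 3 sketch under that assumption; BlockStyleDumper is a hypothetical name, and only the str variants are shown because Python 3 has no separate unicode type.

```python
import yaml
from yaml.representer import SafeRepresenter


class folded_str(str):
    """Marker type: dump as a folded block ('>')."""


class literal_str(str):
    """Marker type: dump as a literal block ('|')."""


def change_style(style, representer):
    # Wrap an existing representer and override the style of the node it returns.
    def new_representer(dumper, data):
        scalar = representer(dumper, data)
        scalar.style = style
        return scalar
    return new_representer


class BlockStyleDumper(yaml.SafeDumper):
    """Hypothetical subclass so the representers stay local to this dumper."""


# add_representer is a classmethod; the subclass gets its own registry copy.
BlockStyleDumper.add_representer(
    folded_str, change_style('>', SafeRepresenter.represent_str))
BlockStyleDumper.add_representer(
    literal_str, change_style('|', SafeRepresenter.represent_str))

data = {
    'foo': literal_str('this is a\nblock literal'),
    'bar': folded_str('this is a folded block'),
}

text = yaml.dump(data, Dumper=BlockStyleDumper)
print(text)
```

Calls to yaml.dump without the Dumper argument are unaffected, which can be useful when other code in the same process relies on the stock dumpers.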

Using `default_style`

If you want all your strings to follow a default style, you can also use the default_style keyword argument, e.g.:

    >>> data = { 'foo': 'line1\nline2\nline3' }
    >>> print yaml.dump(data, default_style='|')
    "foo": |-
      line1
      line2
      line3

or for folded blocks:

    >>> print yaml.dump(data, default_style='>')
    "foo": >-
      line1

      line2

      line3

or for double-quoted scalars:

    >>> print yaml.dump(data, default_style='"')
    "foo": "line1\nline2\nline3"
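For readers on Python 3, a small sketch of the same default_style calls follows. It dumps the same data in each of the three styles and checks that every variant loads back to the original value; the variable names are illustrative.

```python
import yaml

data = {'foo': 'line1\nline2\nline3'}

# Dump the same mapping once per style and keep the text for inspection.
dumped = {}
for style in ('|', '>', '"'):
    dumped[style] = yaml.dump(data, default_style=style)
    print(dumped[style])

# Whatever the presentation, each document should load back to the same data.
for text in dumped.values():
    assert yaml.safe_load(text) == data
```

Note that default_style applies to every scalar the dumper emits, keys included, which is why the key appears double-quoted in the outputs above.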

Caveats:

Here is an example of something you may not expect:

    data = {
        'foo': literal_str('this is a\nblock literal'),
        'bar': folded_unicode('this is a folded block'),
        'non-printable': literal_unicode('this has a \t tab in it'),
        'leading': literal_unicode('   with leading white spaces'),
        'trailing': literal_unicode('with trailing white spaces  '),
    }
    print yaml.dump(data)

results in:

    bar: >-
      this is a folded block
    foo: |-
      this is a
      block literal
    leading: |2-
         with leading white spaces
    non-printable: "this has a \t tab in it"
    trailing: "with trailing white spaces  "

1) non-printable characters

See the YAML spec for escaped characters (Section 5.7):

Note that escape sequences are only interpreted in double-quoted scalars. In all other scalar styles, the “\” character has no special meaning and non-printable characters are not available.

If you want to preserve non-printable characters (e.g. TAB), you need to use double-quoted scalars. If you are able to dump a scalar with literal style, and there is a non-printable character (e.g. TAB) in there, your YAML dumper is non-compliant.

E.g. pyyaml detects the non-printable character \t and uses the double-quoted style even though a default style is specified:

    >>> data = { 'foo': 'line1\nline2\n\tline3' }
    >>> print yaml.dump(data, default_style='"')
    "foo": "line1\nline2\n\tline3"

    >>> print yaml.dump(data, default_style='>')
    "foo": "line1\nline2\n\tline3"

    >>> print yaml.dump(data, default_style='|')
    "foo": "line1\nline2\n\tline3"

2) leading and trailing white spaces

Another bit of useful information in the spec is:

All leading and trailing white space characters are excluded from the content

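This means that leading or trailing white space in your string may not survive in scalar styles other than double-quoted. As a consequence, pyyaml inspects the content of your scalar and may override the style you asked for: in the output above, the string with leading white space keeps the literal style but gains an explicit indentation indicator (|2-), while the string with trailing white space is forced into the double-quoted style.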
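Both caveats can be turned into a rough pre-check that flags strings unlikely to come out in block style. This is only a heuristic sketch based on the behaviour shown above, not PyYAML's actual analysis code, and the function name block_style_is_likely is made up for illustration.

```python
def block_style_is_likely(text):
    """Heuristic guess: True if text looks safe for a '|' or '>' block scalar.

    Assumption: this approximates the two caveats described above. It is
    deliberately conservative and treats non-ASCII characters as unsafe too.
    """
    # Caveat 1: anything outside printable ASCII (e.g. TAB), except newline.
    for ch in text:
        if ch != '\n' and not ('\x20' <= ch <= '\x7e'):
            return False
    # Caveat 2: a space at the end of any line, including the last one.
    for line in text.split('\n'):
        if line.endswith(' '):
            return False
    return True


samples = {
    'plain': 'this is a\nblock literal',
    'non-printable': 'this has a \t tab in it',
    'leading': '   with leading white spaces',
    'trailing': 'with trailing white spaces  ',
}

for name, value in samples.items():
    print(name, block_style_is_likely(value))
```

A check like this could be used to decide whether wrapping a string in a marker type is worthwhile, or whether the dumper would fall back to double quotes anyway.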

answered Sep 22 '22 10:09

dnozay
