Pretty-print dataclasses prettier

Python data class instances come with a generated string representation, but its output isn't really sufficient for pretty-printing purposes when classes have more than a few fields and/or longer field values.

Basically I'm looking for a way to customize the default dataclasses string representation routine or for a pretty-printer that understands data classes and prints them prettier.

So, it's just a small customization I have in mind: adding a line break after each field while indenting lines after the first one.

For example, instead of

x = InventoryItem('foo', 23)
print(x) # =>
InventoryItem(name='foo', unit_price=23, quantity_on_hand=0)

I want to get a string representation like this:

x = InventoryItem('foo', 23)
print(x) # =>
InventoryItem(
    name='foo',
    unit_price=23,
    quantity_on_hand=0
)

Or similar. Perhaps a pretty-printer could get even fancier, such as aligning the = assignment characters or something like that.

Of course, it should also work in a recursive fashion, e.g. fields that are also dataclasses should be indented more.
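
For reference, a minimal definition that reproduces the default output shown above. The field types and the default for quantity_on_hand are assumptions modelled on the InventoryItem example in the dataclasses documentation, not something stated in this question:

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    # Field types and the default value are assumed for illustration.
    name: str
    unit_price: float
    quantity_on_hand: int = 0


x = InventoryItem('foo', 23)
default_repr = repr(x)  # the dataclass-generated __repr__, always a single line
print(default_repr)
# InventoryItem(name='foo', unit_price=23, quantity_on_hand=0)
```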

asked Mar 25 '21 21:03 by maxschlepzig



3 Answers

The standard pprint module supports pretty-printing dataclasses only since Python 3.10 (NB: Python 3.10 was released in 2021).

Example:

[ins] In [1]: from dataclasses import dataclass
         ...:
         ...: @dataclass
         ...: class Point:
         ...:     x: int
         ...:     y: int
         ...:
         ...: @dataclass
         ...: class Coords:
         ...:     my_points: list
         ...:     my_dict: dict
         ...:
         ...: coords = Coords([Point(1, 2), Point(3, 4)], {'a': (1, 2), (1, 2): 'a'})

[ins] In [2]: import pprint

[ins] In [3]: pprint.pprint(coords, width=20)
Coords(my_points=[Point(x=1,
                        y=2),
                  Point(x=3,
                        y=4)],
       my_dict={'a': (1,
                      2),
                (1, 2): 'a'})
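
If the formatted text is needed as a string rather than printed, pprint.pformat() takes the same width argument. A small sketch reusing the classes from above; the variable names narrow and wide are just for illustration. Wrapping only kicks in when the one-line repr doesn't fit into width, and the field-by-field wrapping of dataclasses requires Python 3.10+:

```python
import pprint
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Coords:
    my_points: list
    my_dict: dict


coords = Coords([Point(1, 2), Point(3, 4)], {'a': (1, 2), (1, 2): 'a'})

# pformat() returns the formatted text instead of writing it to stdout.
narrow = pprint.pformat(coords, width=20)   # too narrow for one line: gets wrapped
wide = pprint.pformat(coords, width=200)    # fits on one line: identical to repr()

print(narrow)
print(wide)
```

So a small width is a simple way to force one field per line, at the cost of pprint's hanging-indent layout.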

When using Python 3.9 or older, there is the prettyprinter package that supports dataclasses and provides some nice pretty-printing features.

Example:

[ins] In [1]: from dataclasses import dataclass
         ...:
         ...: @dataclass
         ...: class Point:
         ...:     x: int
         ...:     y: int
         ...:
         ...: @dataclass
         ...: class Coords:
         ...:     my_points: list
         ...:     my_dict: dict
         ...:
         ...: coords = Coords([Point(1, 2), Point(3, 4)], {'a': (1, 2), (1, 2): 'a'})

[nav] In [2]: import prettyprinter as pp

[ins] In [3]: pp.pprint(coords)
Coords(my_points=[Point(x=1, y=2), Point(x=3, y=4)], my_dict={'a': (1, 2), (1, 2): 'a'})

Dataclasses support isn't enabled by default, so it has to be switched on first:

[nav] In [4]: pp.install_extras()
[ins] In [5]: pp.pprint(coords)
Coords(
    my_points=[Point(x=1, y=2), Point(x=3, y=4)],
    my_dict={'a': (1, 2), (1, 2): 'a'}
)

Or to force indenting of all fields:

[ins] In [6]: pp.pprint(coords, width=1)
Coords(
    my_points=[
        Point(
            x=1,
            y=2
        ),
        Point(
            x=3,
            y=4
        )
    ],
    my_dict={
        'a': (
            1,
            2
        ),
        (
            1,
            2
        ): 'a'
    }
)

Prettyprinter can even syntax-highlight! (cf. cpprint())


Considerations:

  • prettyprinter isn't part of the Python standard library
  • fields that still hold their default value aren't printed at all, and as of 2021 there is no way around this
  • prettyprinter is slow, i.e. much slower than the standard pprint; e.g. to check whether a value is a default value, it compares it against a default-constructed value
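
To illustrate the default-value point: the following is only a sketch of the general idea, written with the standard dataclasses API. It is not prettyprinter's actual code, and non_default_fields is a hypothetical helper name. It shows why such a check costs extra work on every print, since each default_factory has to be called to obtain something to compare against:

```python
from dataclasses import MISSING, dataclass, field, fields


def non_default_fields(obj):
    """Return {name: value} for the fields whose value differs from the declared default."""
    result = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        if f.default is not MISSING:
            default = f.default
        elif f.default_factory is not MISSING:
            default = f.default_factory()  # builds a fresh default on every call
        else:
            result[f.name] = value  # no default declared: always kept
            continue
        if value != default:
            result[f.name] = value
    return result


@dataclass
class InventoryItem:
    name: str
    unit_price: float
    quantity_on_hand: int = 0
    tags: list = field(default_factory=list)


print(non_default_fields(InventoryItem('foo', 23)))
# {'name': 'foo', 'unit_price': 23}
```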
answered Oct 19 '22 19:10 by maxschlepzig


Python 3.10+ supports pretty-printing dataclasses:

Python 3.10.0b2+ (heads/3.10:f807a4fad4, Sep  4 2021, 18:58:04) [GCC 11.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from dataclasses import dataclass
>>> @dataclass
... class Literal:
...     value: 'Any'
... 
>>> @dataclass
... class Binary:
...     left: 'Binary | Literal'
...     operator: str
...     right: 'Binary | Literal'
... 
>>> from pprint import pprint
>>> # magic happens here
>>> pprint(
... Binary(Binary(Literal(2), '*', Literal(100)), '+', Literal(50)))
Binary(left=Binary(left=Literal(value=2),
                   operator='*',
                   right=Literal(value=100)),
       operator='+',
       right=Literal(value=50))
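
To get this from a plain print(x), as the question asks, one possible approach is to route __str__ through pprint.pformat(). This is a sketch, not an established recipe: PrettyStrMixin is a made-up name and the width of 40 is arbitrary. It relies on pprint formatting objects via repr(), so calling it from __str__ does not recurse, and on the dataclass keeping its generated __repr__:

```python
import pprint
from dataclasses import dataclass


class PrettyStrMixin:
    """Hypothetical helper: make str()/print() use pprint's layout."""

    def __str__(self):
        # pprint formats via repr(), never str(), so this does not call itself.
        return pprint.pformat(self, width=40)


@dataclass
class Literal(PrettyStrMixin):
    value: object


@dataclass
class Binary(PrettyStrMixin):
    left: object
    operator: str
    right: object


expr = Binary(Binary(Literal(2), '*', Literal(100)), '+', Literal(50))
text = str(expr)
print(expr)  # wrapped field by field on Python 3.10+, one line on older versions
```

repr(expr) stays untouched, so debuggers and logging that rely on the one-line form keep working.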
answered Oct 19 '22 19:10 by Drdilyor


We can use dataclasses.fields to recurse through nested dataclasses and pretty print them:

from collections.abc import Mapping, Iterable
from dataclasses import is_dataclass, fields

def pretty_print(obj, indent=4):
    """
    Pretty prints a (possibly deeply-nested) dataclass.
    Each new block will be indented by `indent` spaces (default is 4).
    """
    print(stringify(obj, indent))

def stringify(obj, indent=4, _indents=0):
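    if isinstance(obj, (str, bytes)):
        # repr() quotes and escapes correctly; bytes would otherwise be iterated as ints
        return repr(obj)

    if not is_dataclass(obj) and not isinstance(obj, (Mapping, Iterable)):
        return repr(obj)  # match the repr-style output dataclasses use for field values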

    this_indent = indent * _indents * ' '
    next_indent = indent * (_indents + 1) * ' '
    start, end = f'{type(obj).__name__}(', ')'  # dicts, lists, and tuples will re-assign this

    if is_dataclass(obj):
        body = '\n'.join(
            f'{next_indent}{field.name}='
            f'{stringify(getattr(obj, field.name), indent, _indents + 1)},' for field in fields(obj)
        )

    elif isinstance(obj, Mapping):
        if isinstance(obj, dict):
            start, end = '{}'

        body = '\n'.join(
            f'{next_indent}{stringify(key, indent, _indents + 1)}: '
            f'{stringify(value, indent, _indents + 1)},' for key, value in obj.items()
        )

    else:  # is Iterable
        if isinstance(obj, list):
            start, end = '[]'
        elif isinstance(obj, tuple):
            start = '('

        body = '\n'.join(
            f'{next_indent}{stringify(item, indent, _indents + 1)},' for item in obj
        )

    return f'{start}\n{body}\n{this_indent}{end}'

We can test it with a nested dataclass:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

@dataclass
class Coords:
    my_points: list
    my_dict: dict

coords = Coords([Point(1, 2), Point(3, 4)], {'a': (1, 2), (1, 2): 'a'})

pretty_print(coords)

# Coords(
#     my_points=[
#         Point(
#             x=1,
#             y=2,
#         ),
#         Point(
#             x=3,
#             y=4,
#         ),
#     ],
#     my_dict={
#         'a': (
#             1,
#             2,
#         ),
#         (
#             1,
#             2,
#         ): 'a',
#     },
# )

This should be general enough to cover most cases. Hope this helps!

answered Oct 19 '22 17:10 by salt-die