Python Data Classes instances also include a string representation method, but its result isn't really sufficient for pretty printing purposes when classes have more than a few fields and/or longer field values.
Basically I'm looking for a way to customize the default dataclasses string representation routine or for a pretty-printer that understands data classes and prints them prettier.
So, it's just a small customization I have in mind: adding a line break after each field while indenting lines after the first one.
For example, instead of
x = InventoryItem('foo', 23)
print(x) # =>
InventoryItem(name='foo', unit_price=23, quantity_on_hand=0)
I want to get a string representation like this:
x = InventoryItem('foo', 23)
print(x) # =>
InventoryItem(
    name='foo',
    unit_price=23,
    quantity_on_hand=0
)
Or similar. Perhaps a pretty-printer could get even fancier, such as aligning the = assignment characters or something like that.
Of course, it should also work in a recursive fashion, e.g. fields that are also dataclasses should be indented more.
The pprint module in Python is a utility module that you can use to print data structures in a readable, pretty way. It's a part of the standard library that's especially useful for debugging code dealing with API requests, large JSON files, and data in general.
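As a rough illustration of what that looks like for ordinary containers, here is a minimal sketch. The `response` dictionary is made-up sample data, and `width=30` is an arbitrary choice that forces wrapping; `pformat` returns the same text that `pprint` would print.

```python
from pprint import pformat

# Made-up sample data standing in for a decoded API response.
response = {
    "name": "foo",
    "tags": ["a", "b", "c"],
    "stock": {"unit_price": 23, "quantity_on_hand": 0},
}

# A narrow width forces pprint to break the structure across several lines.
text = pformat(response, width=30)
print(text)
```

The formatted text is still a valid Python literal, so it can be pasted back into code if needed.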
Python introduced the dataclass in version 3.7 (PEP 557). The dataclass allows you to define classes with less code and more functionality out of the box. The following defines a regular Person class with two instance attributes, name and age:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
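For comparison, the same class could be written as a dataclass roughly as follows. This is only a sketch: the type annotations `str` and `int` are assumptions, since the plain class above does not declare any types.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # The annotations are assumed; @dataclass generates __init__, __repr__ and __eq__ from them.
    name: str
    age: int


person = Person("Ann", 30)
print(person)  # Person(name='Ann', age=30)
```

The generated `__repr__` is the single-line representation that the question wants to make more readable.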
Even though pretty-printing data classes may seem basic, there is currently no tool perfectly tailored to the task.
The pprint module supports pretty-printing dataclasses only since Python 3.10 (NB: Python 3.10 was released in 2021).
Example:
[ins] In [1]: from dataclasses import dataclass
         ...:
         ...: @dataclass
         ...: class Point:
         ...:     x: int
         ...:     y: int
         ...:
         ...: @dataclass
         ...: class Coords:
         ...:     my_points: list
         ...:     my_dict: dict
         ...:
         ...: coords = Coords([Point(1, 2), Point(3, 4)], {'a': (1, 2), (1, 2): 'a'})
[ins] In [2]: import pprint
[ins] In [15]: pprint.pprint(coords, width=20)
Coords(my_points=[Point(x=1,
                        y=2),
                  Point(x=3,
                        y=4)],
       my_dict={'a': (1,
                      2),
                (1, 2): 'a'})
When using Python 3.9 or older, there is the prettyprinter package that supports dataclasses and provides some nice pretty-printing features.
Example:
[ins] In [1]: from dataclasses import dataclass
         ...:
         ...: @dataclass
         ...: class Point:
         ...:     x: int
         ...:     y: int
         ...:
         ...: @dataclass
         ...: class Coords:
         ...:     my_points: list
         ...:     my_dict: dict
         ...:
         ...: coords = Coords([Point(1, 2), Point(3, 4)], {'a': (1, 2), (1, 2): 'a'})
[nav] In [2]: import prettyprinter as pp
[ins] In [3]: pp.pprint(coords)
Coords(my_points=[Point(x=1, y=2), Point(x=3, y=4)], my_dict={'a': (1, 2), (1, 2): 'a'})
Dataclass support isn't enabled by default, so the extras have to be installed first:
[nav] In [4]: pp.install_extras()
[ins] In [5]: pp.pprint(coords)
Coords(
    my_points=[Point(x=1, y=2), Point(x=3, y=4)],
    my_dict={'a': (1, 2), (1, 2): 'a'}
)
Or to force indenting of all fields:
[ins] In [6]: pp.pprint(coords, width=1)
Coords(
    my_points=[
        Point(
            x=1,
            y=2
        ),
        Point(
            x=3,
            y=4
        )
    ],
    my_dict={
        'a': (
            1,
            2
        ),
        (
            1,
            2
        ): 'a'
    }
)
Prettyprinter can even syntax-highlight! (cf. cpprint())
Considerations:
- prettyprinter isn't part of the Python standard library
- prettyprinter pretty-prints very slowly, i.e. much slower than the standard pprint; e.g. to check whether a value is a default value, it compares it against a default-constructed value

Python 3.10+ supports pretty-printing dataclasses:
Python 3.10.0b2+ (heads/3.10:f807a4fad4, Sep 4 2021, 18:58:04) [GCC 11.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from dataclasses import dataclass
>>> @dataclass
... class Literal:
...     value: 'Any'
...
>>> @dataclass
... class Binary:
...     left: 'Binary | Literal'
...     operator: str
...     right: 'Binary | Literal'
...
>>> from pprint import pprint
>>> # magic happens here
>>> pprint(
...     Binary(Binary(Literal(2), '*', Literal(100)), '+', Literal(50)))
Binary(left=Binary(left=Literal(value=2),
                   operator='*',
                   right=Literal(value=100)),
       operator='+',
       right=Literal(value=50))
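Since the question asks about customizing the string representation itself, one way to reuse this might be to route str() through pprint.pformat. The following is only a sketch: the PrettyStr mixin is a hypothetical helper, not part of any library, and the InventoryItem field types are assumed from the question's example. On Python 3.10+ this should put each field on its own line (aligned under the opening parenthesis, in pprint's usual style); on older versions pformat should simply fall back to the one-line repr.

```python
from dataclasses import dataclass
from pprint import pformat


class PrettyStr:
    """Hypothetical mixin: str() delegates to pprint.pformat()."""

    def __str__(self):
        # width=1 asks pprint to wrap as aggressively as it can.
        # pformat() builds on repr(), not str(), so this does not recurse.
        return pformat(self, width=1)


@dataclass
class InventoryItem(PrettyStr):
    # Field types are assumed; the question only shows the field names.
    name: str
    unit_price: float
    quantity_on_hand: int = 0


item = InventoryItem('foo', 23)
text = str(item)
print(text)
```

The generated __repr__ is left untouched, so repr(item) and debugger output keep the compact single-line form.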
We can use dataclasses.fields to recurse through nested dataclasses and pretty print them:
from collections.abc import Mapping, Iterable
from dataclasses import is_dataclass, fields


def pretty_print(obj, indent=4):
    """
    Pretty prints a (possibly deeply-nested) dataclass.
    Each new block will be indented by `indent` spaces (default is 4).
    """
    print(stringify(obj, indent))


def stringify(obj, indent=4, _indents=0):
    # str and bytes are Iterable too, so handle them before the container checks;
    # repr() also copes with values that contain quote characters.
    if isinstance(obj, (str, bytes)):
        return repr(obj)

    # Anything that is neither a dataclass nor a container is printed as-is.
    if not is_dataclass(obj) and not isinstance(obj, (Mapping, Iterable)):
        return str(obj)

    this_indent = indent * _indents * ' '
    next_indent = indent * (_indents + 1) * ' '
    start, end = f'{type(obj).__name__}(', ')'  # dicts, lists, and tuples will re-assign this

    if is_dataclass(obj):
        body = '\n'.join(
            f'{next_indent}{field.name}='
            f'{stringify(getattr(obj, field.name), indent, _indents + 1)},' for field in fields(obj)
        )

    elif isinstance(obj, Mapping):
        if isinstance(obj, dict):
            start, end = '{}'

        body = '\n'.join(
            f'{next_indent}{stringify(key, indent, _indents + 1)}: '
            f'{stringify(value, indent, _indents + 1)},' for key, value in obj.items()
        )

    else:  # is Iterable
        if isinstance(obj, list):
            start, end = '[]'
        elif isinstance(obj, tuple):
            start = '('

        body = '\n'.join(
            f'{next_indent}{stringify(item, indent, _indents + 1)},' for item in obj
        )

    return f'{start}\n{body}\n{this_indent}{end}'
We can test it with a nested dataclass:
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

@dataclass
class Coords:
    my_points: list
    my_dict: dict

coords = Coords([Point(1, 2), Point(3, 4)], {'a': (1, 2), (1, 2): 'a'})
pretty_print(coords)
# Coords(
#     my_points=[
#         Point(
#             x=1,
#             y=2,
#         ),
#         Point(
#             x=3,
#             y=4,
#         ),
#     ],
#     my_dict={
#         'a': (
#             1,
#             2,
#         ),
#         (
#             1,
#             2,
#         ): 'a',
#     },
# )
This should be general enough to cover most cases. Hope this helps!
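The question also floats the idea of aligning the = characters. A minimal, non-recursive sketch of that could look like the following; format_aligned is a hypothetical helper name, values are rendered with plain repr(), and the InventoryItem field types are assumed from the question's example.

```python
from dataclasses import dataclass, fields, is_dataclass


def format_aligned(obj, indent=4):
    """Hypothetical helper: one field per line, with the '=' signs aligned."""
    if not is_dataclass(obj) or isinstance(obj, type):
        return repr(obj)
    # Skip fields that were declared with repr=False, as the default repr does.
    names = [f.name for f in fields(obj) if f.repr]
    width = max(map(len, names), default=0)
    pad = ' ' * indent
    lines = [f'{pad}{name:<{width}} = {getattr(obj, name)!r},' for name in names]
    return '\n'.join([f'{type(obj).__name__}(', *lines, ')'])


@dataclass
class InventoryItem:
    # Field types are assumed; the question only shows the field names.
    name: str
    unit_price: float
    quantity_on_hand: int = 0


aligned = format_aligned(InventoryItem('foo', 23))
print(aligned)
```

Nested dataclasses are not expanded here; combining the alignment with the recursive approach above would need the padding to be computed per nesting level.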