Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python pretty print dictionary of lists, abbreviate long lists

I have a dictionary of lists and the lists are quite long. How can I print it in a way that only a few elements of the list show up? Obviously, I can write a custom function for that but is there any built-in way or library that can achieve this? For example when printing large data frames, pandas prints it nicely in a short way.

This example better illustrates what I mean:

obj = {'key_1': ['EG8XYD9FVN',
  'S2WARDCVAO',
  'J00YCU55DP',
  'R07BUIF2F7',
  'VGPS1JD0UM',
  'WL3TWSDP8E',
  'LD8QY7DMJ3',
  'J36U3Z9KOQ',
  'KU2FUGYB2U',
  'JF3RQ315BY'],
 'key_2': ['162LO154PM',
  '3ROAV881V2',
  'I4T79LP18J',
  'WBD36EM6QL',
  'DEIODVQU46',
  'KWSJA5WDKQ',
  'WX9SVRFO0G',
  '6UN63WU64G',
  '3Z89U7XM60',
  '167CYON6YN']}

Desired output: something like this:

{'key_1':
    ['EG8XYD9FVN', 'S2WARDCVAO', '...'],
 'key_2':
    ['162LO154PM', '3ROAV881V2', '...']
}
like image 213
CentAu Avatar asked Jul 22 '16 18:07

CentAu


People also ask

How do you pretty-print dictionary values using which modules and function?

The pprint module provides a capability to “pretty-print” arbitrary Python data structures in a form which can be used as input to the interpreter. If the formatted structures include objects which are not fundamental Python types, the representation may not be loadable.

How do I print a Pprint dictionary?

Dictionaries and lists can be converted to the string with str() . In this case, they are converted to a one-line string without newline, as in the output of print() . You can use pprint. pformat() to get the output of pprint.

What is Pprint Python?

Short for Pretty Printer, pprint is a native Python library that allows you to customize the formatting of your output.


3 Answers

If it weren't for the pretty printing, the reprlib module would be the way to go: Safe, elegant and customizable handling of deeply nested and recursive / self-referencing data structures is what it has been made for.

However, it turns out combining the reprlib and pprint modules isn't trivial, at least I couldn't come up with a clean way without breaking (some) of the pretty printing aspects.

So instead, here's a solution that just subclasses PrettyPrinter to crop / abbreviate lists as necessary:

from pprint import PrettyPrinter


obj = {
    'key_1': [
        'EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', 'R07BUIF2F7', 'VGPS1JD0UM',
        'WL3TWSDP8E', 'LD8QY7DMJ3', 'J36U3Z9KOQ', 'KU2FUGYB2U', 'JF3RQ315BY',
    ],
    'key_2': [
        '162LO154PM', '3ROAV881V2', 'I4T79LP18J', 'WBD36EM6QL', 'DEIODVQU46',
        'KWSJA5WDKQ', 'WX9SVRFO0G', '6UN63WU64G', '3Z89U7XM60', '167CYON6YN',
    ],
    # Test case to make sure we didn't break handling of recursive structures
    'key_3': [
        '162LO154PM', '3ROAV881V2', [1, 2, ['a', 'b', 'c'], 3, 4, 5, 6, 7],
        'KWSJA5WDKQ', 'WX9SVRFO0G', '6UN63WU64G', '3Z89U7XM60', '167CYON6YN',
    ]
}


class CroppingPrettyPrinter(PrettyPrinter):

    def __init__(self, *args, **kwargs):
        self.maxlist = kwargs.pop('maxlist', 6)
        return PrettyPrinter.__init__(self, *args, **kwargs)

    def _format(self, obj, stream, indent, allowance, context, level):
        if isinstance(obj, list):
            # If object is a list, crop a copy of it according to self.maxlist
            # and append an ellipsis
            if len(obj) > self.maxlist:
                cropped_obj = obj[:self.maxlist] + ['...']
                return PrettyPrinter._format(
                    self, cropped_obj, stream, indent,
                    allowance, context, level)

        # Let the original implementation handle anything else
        # Note: No use of super() because PrettyPrinter is an old-style class
        return PrettyPrinter._format(
            self, obj, stream, indent, allowance, context, level)


p = CroppingPrettyPrinter(maxlist=3)
p.pprint(obj)

Output with maxlist=3:

{'key_1': ['EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', '...'],
 'key_2': ['162LO154PM',
           '3ROAV881V2',
           [1, 2, ['a', 'b', 'c'], '...'],
           '...']}

Output with maxlist=5 (triggers splitting the lists on separate lines):

{'key_1': ['EG8XYD9FVN',
           'S2WARDCVAO',
           'J00YCU55DP',
           'R07BUIF2F7',
           'VGPS1JD0UM',
           '...'],
 'key_2': ['162LO154PM',
           '3ROAV881V2',
           'I4T79LP18J',
           'WBD36EM6QL',
           'DEIODVQU46',
           '...'],
 'key_3': ['162LO154PM',
           '3ROAV881V2',
           [1, 2, ['a', 'b', 'c'], 3, 4, '...'],
           'KWSJA5WDKQ',
           'WX9SVRFO0G',
           '...']}

Notes:

  • This will create copies of lists. Depending on the size of the data structures, this can be very expensive in terms of memory use.
  • This only deals with the special case of lists. Equivalent behavior would have to be implemented for dicts, tuples, sets, frozensets, ... for this class to be of general use.
like image 53
Lukas Graf Avatar answered Oct 24 '22 05:10

Lukas Graf


You could use IPython.lib.pretty.

from IPython.lib.pretty import pprint

> pprint(obj, max_seq_length=5)
{'key_1': ['EG8XYD9FVN',
  'S2WARDCVAO',
  'J00YCU55DP',
  'R07BUIF2F7',
  'VGPS1JD0UM',
  ...],
 'key_2': ['162LO154PM',
  '3ROAV881V2',
  'I4T79LP18J',
  'WBD36EM6QL',
  'DEIODVQU46',
  ...]}

> pprint(dict(map(lambda i: (i, range(i + 5)), range(100))), max_seq_length=10)
{0: [0, 1, 2, 3, 4],
 1: [0, 1, 2, 3, 4, 5],
 2: [0, 1, 2, 3, 4, 5, 6],
 3: [0, 1, 2, 3, 4, 5, 6, 7],
 4: [0, 1, 2, 3, 4, 5, 6, 7, 8],
 5: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 6: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...],
 7: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...],
 8: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...],
 9: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...],
 ...}

For older versions of IPython, you might exploit RepresentationPrinter:

from IPython.lib.pretty import RepresentationPrinter
import sys

def compact_pprint(obj, max_seq_length=10):
    printer = RepresentationPrinter(sys.stdout)
    printer.max_seq_length = max_seq_length
    printer.pretty(obj)
    printer.flush()
like image 39
Michael Hoff Avatar answered Oct 24 '22 06:10

Michael Hoff


You could use the pprint module:

pprint.pprint(obj)

Would output:

{'key_1': ['EG8XYD9FVN',
           'S2WARDCVAO',
           'J00YCU55DP',
           'R07BUIF2F7',
           'VGPS1JD0UM',
           'WL3TWSDP8E',
           'LD8QY7DMJ3',
           'J36U3Z9KOQ',
           'KU2FUGYB2U',
           'JF3RQ315BY'],
 'key_2': ['162LO154PM',
           '3ROAV881V2',
           'I4T79LP18J',
           'WBD36EM6QL',
           'DEIODVQU46',
           'KWSJA5WDKQ',
           'WX9SVRFO0G',
           '6UN63WU64G',
           '3Z89U7XM60',
           '167CYON6YN']}

And,

pprint.pprint(obj,depth=1)

Would output:

{'key_1': [...], 'key_2': [...]}

And,

pprint.pprint(obj,compact=True)

would output:

{'key_1': ['EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', 'R07BUIF2F7',
           'VGPS1JD0UM', 'WL3TWSDP8E', 'LD8QY7DMJ3', 'J36U3Z9KOQ',
           'KU2FUGYB2U', 'JF3RQ315BY'],
 'key_2': ['162LO154PM', '3ROAV881V2', 'I4T79LP18J', 'WBD36EM6QL',
           'DEIODVQU46', 'KWSJA5WDKQ', 'WX9SVRFO0G', '6UN63WU64G',
           '3Z89U7XM60', '167CYON6YN']}
like image 7
noteness Avatar answered Oct 24 '22 05:10

noteness