
How to remove all empty fields in a nested dict?

Tags:

python

I have a dict whose values may themselves be dicts or lists. How can I remove all empty fields from it?

An "empty field" is a field whose value is an empty list ([]), None, or an empty dict (including a dict whose sub-fields are all empty).

Example input:

{
    "fruit": [
        {"apple": 1},
        {"banana": None}
    ],
    "veg": [],
    "result": {
        "apple": 1,
        "banana": None
    }
}

Output:

{
    "fruit": [
        {"apple": 1}
    ],
    "result": {
        "apple": 1
    }
}
Pier Cheng asked Jan 15 '15 22:01


2 Answers

Use a recursive function that returns a new dictionary:

def clean_empty(d):
    if isinstance(d, dict):
        return {
            k: v 
            for k, v in ((k, clean_empty(v)) for k, v in d.items())
            if v
        }
    if isinstance(d, list):
        return [v for v in map(clean_empty, d) if v]
    return d

The {..} construct is a dictionary comprehension; it includes a key from the original dictionary only if its cleaned value v is truthy, i.e. not empty. Similarly, the [..] construct builds a list containing only the truthy cleaned values.

The nested (.. for ..) construct is a generator expression that allows the code to compactly filter empty objects after recursing.
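To make the recurse-then-filter order explicit, the comprehensions can be unrolled into plain loops. This is only an illustrative sketch of the same logic; the name clean_empty_verbose is hypothetical and not part of the original answer.

```python
def clean_empty_verbose(d):
    """Loop-based equivalent of clean_empty (hypothetical name, for illustration)."""
    if isinstance(d, dict):
        result = {}
        for k, v in d.items():
            cleaned = clean_empty_verbose(v)  # recurse into the value first
            if cleaned:                       # then keep it only if still truthy
                result[k] = cleaned
        return result
    if isinstance(d, list):
        result = []
        for v in d:
            cleaned = clean_empty_verbose(v)
            if cleaned:
                result.append(cleaned)
        return result
    return d  # scalars are returned unchanged; the caller decides whether to keep them
```

Because the filter runs on the cleaned value rather than the original, a dict such as {"banana": None} first becomes {} and is then dropped by its parent.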

Another way of constructing such a function is to use the @singledispatch decorator; you then write multiple functions, one per object type:

from functools import singledispatch

@singledispatch
def clean_empty(obj):
    return obj

@clean_empty.register
def _dicts(d: dict):
    items = ((k, clean_empty(v)) for k, v in d.items())
    return {k: v for k, v in items if v}

@clean_empty.register
def _lists(l: list):
    items = map(clean_empty, l)
    return [v for v in items if v]

The above @singledispatch version does exactly the same thing as the first function, but the isinstance() tests are now handled by the decorator implementation, based on the type annotations of the registered functions (register() has supported annotations since Python 3.7). I also put the nested iterators (the generator expression and the map() call) into separate variables to improve readability further.
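register() also accepts the type as an explicit argument, which works on older Python 3 versions as well (singledispatch itself was added in Python 3.4). A minimal sketch of that variant, run against the sample data from the question:

```python
from functools import singledispatch

@singledispatch
def clean_empty(obj):
    # Fallback for any type without a registered handler: return it unchanged.
    return obj

@clean_empty.register(dict)
def _dicts(d):
    items = ((k, clean_empty(v)) for k, v in d.items())
    return {k: v for k, v in items if v}

@clean_empty.register(list)
def _lists(l):
    items = map(clean_empty, l)
    return [v for v in items if v]

sample = {
    "fruit": [{"apple": 1}, {"banana": None}],
    "veg": [],
    "result": {"apple": 1, "banana": None},
}
print(clean_empty(sample))
# {'fruit': [{'apple': 1}], 'result': {'apple': 1}}
```

Dispatch follows the class hierarchy, so subclasses of dict or list are routed to the same handlers; note that the handlers return plain dict and list objects rather than the subclass.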

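Note that every falsy value is removed, not just None and empty containers: numeric 0 (integer 0, float 0.0), False, and empty strings are cleared as well. You can retain numeric 0 values with if v or v == 0; because False == 0 in Python, that test also retains False.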
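If only the values the question calls empty (None, empty lists, empty dicts) should be removed, the truthiness test can be swapped for an explicit check. The following is a sketch under that assumption; is_empty and clean_empty_strict are hypothetical names introduced here for illustration.

```python
def is_empty(value):
    # "Empty" as defined in the question: None, an empty list, or an empty dict.
    return value is None or (isinstance(value, (dict, list)) and len(value) == 0)

def clean_empty_strict(d):
    if isinstance(d, dict):
        cleaned = ((k, clean_empty_strict(v)) for k, v in d.items())
        return {k: v for k, v in cleaned if not is_empty(v)}
    if isinstance(d, list):
        return [v for v in map(clean_empty_strict, d) if not is_empty(v)]
    return d

data = {"count": 0, "flag": False, "name": "", "gone": None, "nested": {"x": None}}
print(clean_empty_strict(data))
# {'count': 0, 'flag': False, 'name': ''}
```

Here 0, False and the empty string survive because they are neither None nor an empty container, while "nested" is dropped once its only field has been removed.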

Demo of the first function:

>>> sample = {
...     "fruit": [
...         {"apple": 1},
...         {"banana": None}
...     ],
...     "veg": [],
...     "result": {
...         "apple": 1,
...         "banana": None
...     }
... }
>>> def clean_empty(d):
...     if isinstance(d, dict):
...         return {
...             k: v
...             for k, v in ((k, clean_empty(v)) for k, v in d.items())
...             if v
...         }
...     if isinstance(d, list):
...         return [v for v in map(clean_empty, d) if v]
...     return d
... 
>>> clean_empty(sample)
{'fruit': [{'apple': 1}], 'result': {'apple': 1}}
Martijn Pieters answered Sep 22 '22 19:09


If you want a full-featured yet succinct approach to handling real-world data structures, which are often nested and can even contain cycles and other kinds of containers, I recommend looking at the remap utility from the boltons utility package.

After pip install boltons or copying iterutils.py into your project, just do:

from boltons.iterutils import remap

data = {'veg': [], 'fruit': [{'apple': 1}, {'banana': None}], 'result': {'apple': 1, 'banana': None}}

drop_falsey = lambda path, key, value: bool(value)
clean = remap(data, visit=drop_falsey)
print(clean)

# Output:
{'fruit': [{'apple': 1}], 'result': {'apple': 1}}

This page has many more examples, including ones working with much larger objects from GitHub's API.

It's pure-Python, so it works everywhere, and is fully tested in Python 2.7 and 3.3+. Best of all, I wrote it for exactly cases like this, so if you find a case it doesn't handle, you can bug me to fix it right here.

Mahmoud Hashemi answered Sep 23 '22 19:09