
Python: How to RECURSIVELY remove None values from a NESTED data structure (lists and dictionaries)?

Here is some nested data that includes lists, tuples, and dictionaries:

from collections import OrderedDict
data1 = ( 501, (None, 999), None, (None), 504 )  # note: (None) is just None, not a tuple
data2 = { 1:601, 2:None, None:603, 'four':'sixty' }
data3 = OrderedDict( [(None, 401), (12, 402), (13, None), (14, data2)] )
data = [ [None, 22, tuple([None]), (None,None), None], ( (None, 202), {None:301, 32:302, 33:data1}, data3 ) ]

Goal: Remove any keys or values (from "data") that are None. If a list, tuple, or dictionary contains a value that is itself a list, tuple, or dictionary, then RECURSE to remove NESTED Nones.

Desired output:

[[22, (), ()], ((202,), {32: 302, 33: (501, (999,), 504)}, OrderedDict([(12, 402), (14, {'four': 'sixty', 1: 601})]))]

Or more readably, here is formatted output:

StripNones(data)= list:
. [22, (), ()]
. tuple:
. . (202,)
. . {32: 302, 33: (501, (999,), 504)}
. . OrderedDict([(12, 402), (14, {'four': 'sixty', 1: 601})])

I will propose a possible answer, as I have not found an existing solution to this. I appreciate any alternatives, or pointers to pre-existing solutions.

EDIT: I forgot to mention that this has to work in Python 2.7. I can't use Python 3 at this time.

It IS still worth posting Python 3 solutions for others, so please indicate which Python version you are answering for.

ToolmakerSteve asked Dec 13 '13 03:12




4 Answers

If you can assume that the __init__ methods of the various subclasses have the same signature as the typical base class:

def remove_none(obj):
  if isinstance(obj, (list, tuple, set)):
    return type(obj)(remove_none(x) for x in obj if x is not None)
  elif isinstance(obj, dict):
    return type(obj)((remove_none(k), remove_none(v))
      for k, v in obj.items() if k is not None and v is not None)
  else:
    return obj

from collections import OrderedDict
data1 = ( 501, (None, 999), None, (None), 504 )
data2 = { 1:601, 2:None, None:603, 'four':'sixty' }
data3 = OrderedDict( [(None, 401), (12, 402), (13, None), (14, data2)] )
data = [ [None, 22, tuple([None]), (None,None), None], ( (None, 202), {None:301, 32:302, 33:data1}, data3 ) ]
print(remove_none(data))  # parenthesized form works in both Python 2.7 and 3

Note that this won't work with a defaultdict, for example, since defaultdict takes an additional argument to __init__ (the default factory). Making it work with defaultdict would require another special-case elif, placed before the one for regular dicts.
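As a minimal sketch of that extra special case (my own illustration, not part of the original answer), the branch below rebuilds a defaultdict by passing its default_factory first and the filtered items second. It assumes plain defaultdict instances; a defaultdict subclass would come back as a plain defaultdict.

```python
from collections import OrderedDict, defaultdict

def remove_none(obj):
    """Return a copy of obj with None keys, values, and items removed."""
    if isinstance(obj, (list, tuple, set)):
        return type(obj)(remove_none(x) for x in obj if x is not None)
    elif isinstance(obj, defaultdict):
        # Assumption: rebuild as a plain defaultdict. Its constructor takes
        # the factory first, then the items, so it needs its own branch.
        return defaultdict(
            obj.default_factory,
            ((remove_none(k), remove_none(v))
             for k, v in obj.items() if k is not None and v is not None))
    elif isinstance(obj, dict):
        return type(obj)((remove_none(k), remove_none(v))
                         for k, v in obj.items()
                         if k is not None and v is not None)
    else:
        return obj

# Illustrative input: one None value, one None key, one nested None.
dd = defaultdict(list, {'a': [1, None], 'b': None, None: 3})
cleaned = remove_none(dd)
print(cleaned)
```

The defaultdict branch has to come before the dict branch because defaultdict is a dict subclass and would otherwise be caught by the generic case.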


Also note that I've actually constructed new objects. I haven't modified the old ones. It would be possible to modify the old objects if you didn't need to support modifying immutable objects like tuple.
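For comparison, here is a rough sketch of what in-place modification could look like if only lists and dicts need cleaning. The name strip_none_inplace is hypothetical, tuples are deliberately left untouched because they are immutable, and the sketch assumes the structure contains no cycles.

```python
def strip_none_inplace(obj):
    """Remove None entries from lists and dicts in place (returns None).

    Hypothetical helper: only list and dict are handled. Tuples and other
    immutable containers are left exactly as they are.
    """
    if isinstance(obj, list):
        # Slice assignment mutates the existing list object.
        obj[:] = [x for x in obj if x is not None]
        for x in obj:
            strip_none_inplace(x)
    elif isinstance(obj, dict):
        # Collect the doomed keys first; deleting while iterating is an error.
        for k in [k for k, v in obj.items() if k is None or v is None]:
            del obj[k]
        for v in obj.values():
            strip_none_inplace(v)

inner = [None, 1]
d = {None: 0, 'a': None, 'b': inner}
outer = [None, d, (None, 2)]
strip_none_inplace(outer)
print(outer)
```

Every reference to outer, d, or inner sees the cleaned contents afterwards, but the None inside the tuple survives, which is why the answer above builds new objects instead.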

mgilson answered Oct 26 '22 14:10


If you want a full-featured, yet succinct approach to handling real-world nested data structures like these, and even handle cycles, I recommend looking at the remap utility from the boltons utility package.

After pip install boltons or copying iterutils.py into your project, just do:

from collections import OrderedDict
from boltons.iterutils import remap

data1 = ( 501, (None, 999), None, (None), 504 )
data2 = { 1:601, 2:None, None:603, 'four':'sixty' }
data3 = OrderedDict( [(None, 401), (12, 402), (13, None), (14, data2)] )
data = [ [None, 22, tuple([None]), (None,None), None], ( (None, 202), {None:301, 32:302, 33:data1}, data3 ) ]

drop_none = lambda path, key, value: key is not None and value is not None

cleaned = remap(data, visit=drop_none)

print(cleaned)

# got:
[[22, (), ()], ((202,), {32: 302, 33: (501, (999,), 504)}, OrderedDict([(12, 402), (14, {'four': 'sixty', 1: 601})]))]

This page has many more examples, including ones working with much larger objects (from Github's API).

It's pure-Python, so it works everywhere, and is fully tested in Python 2.7 and 3.3+. Best of all, I wrote it for exactly cases like this, so if you find a case it doesn't handle, you can bug me to fix it right here.

Mahmoud Hashemi answered Oct 26 '22 12:10


def stripNone(data):
    if isinstance(data, dict):
        # Always builds a plain dict, so an OrderedDict comes back as a dict.
        return {k:stripNone(v) for k, v in data.items() if k is not None and v is not None}
    elif isinstance(data, list):
        return [stripNone(item) for item in data if item is not None]
    elif isinstance(data, tuple):
        return tuple(stripNone(item) for item in data if item is not None)
    elif isinstance(data, set):
        return {stripNone(item) for item in data if item is not None}
    else:
        return data

Sample Runs:

print(stripNone(data1))
print(stripNone(data2))
print(stripNone(data3))
print(stripNone(data))

(501, (999,), 504)
{'four': 'sixty', 1: 601}
{12: 402, 14: {'four': 'sixty', 1: 601}}
[[22, (), ()], ((202,), {32: 302, 33: (501, (999,), 504)}, {12: 402, 14: {'four': 'sixty', 1: 601}})]

thefourtheye answered Oct 26 '22 13:10


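def purify(o):
    if hasattr(o, 'items'):
        # Mapping: start from an empty container of the same type.
        oo = type(o)()
        for k in o:
            if k is not None and o[k] is not None:
                oo[k] = purify(o[k])
    # Strings are excluded: in Python 3 they define __iter__ and would
    # otherwise recurse forever (in Python 2 they do not define it).
    elif hasattr(o, '__iter__') and not isinstance(o, (str, bytes)):
        oo = []
        for it in o:
            if it is not None:
                oo.append(purify(it))
    else:
        return o
    return type(o)(oo)

print(purify(data))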

Gives:

[[22, (), ()], ((202,), {32: 302, 33: (501, (999,), 504)}, OrderedDict([(12, 402), (14, {'four': 'sixty', 1: 601})]))]

perreal answered Oct 26 '22 12:10