Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiple levels of 'collection.defaultdict' in Python

People also ask

What is collections Defaultdict in Python?

Defaultdict is a container like dictionaries present in the module collections. Defaultdict is a sub-class of the dictionary class that returns a dictionary-like object. The functionality of both dictionaries and defaultdict are almost same except for the fact that defaultdict never raises a KeyError.

What is the difference between Defaultdict and dict?

The main difference between defaultdict and dict is that when you try to access or modify a key that's not present in the dictionary, a default value is automatically given to that key . In order to provide this functionality, the Python defaultdict type does two things: It overrides .

Is Defaultdict slower than dict?

defaultdict is faster for larger data sets with more homogenous key sets (ie, how short the dict is after adding elements);


Use:

from collections import defaultdict
d = defaultdict(lambda: defaultdict(int))

This will create a new defaultdict(int) whenever a new key is accessed in d.


Another way to make a pickleable, nested defaultdict is to use a partial object instead of a lambda:

from functools import partial
...
d = defaultdict(partial(defaultdict, int))

This will work because the defaultdict class is globally accessible at the module level:

"You can't pickle a partial object unless the function [or in this case, class] it wraps is globally accessible ... under its __name__ (within its __module__)" -- Pickling wrapped partial functions


Look at nosklo's answer here for a more general solution.

class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value

Testing:

a = AutoVivification()

a[1][2][3] = 4
a[1][3][3] = 5
a[1][2]['test'] = 6

print a

Output:

{1: {2: {'test': 6, 3: 4}, 3: {3: 5}}}

As per @rschwieb's request for D['key'] += 1, we can expand on previous by overriding addition by defining __add__ method, to make this behave more like a collections.Counter()

First __missing__ will be called to create a new empty value, which will be passed into __add__. We test the value, counting on empty values to be False.

See emulating numeric types for more information on overriding.

from numbers import Number


class autovivify(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

    def __add__(self, x):
        """ override addition for numeric types when self is empty """
        if not self and isinstance(x, Number):
            return x
        raise ValueError

    def __sub__(self, x):
        if not self and isinstance(x, Number):
            return -1 * x
        raise ValueError

Examples:

>>> import autovivify
>>> a = autovivify.autovivify()
>>> a
{}
>>> a[2]
{}
>>> a
{2: {}}
>>> a[4] += 1
>>> a[5][3][2] -= 1
>>> a
{2: {}, 4: 1, 5: {3: {2: -1}}}

Rather than checking argument is a Number (very non-python, amirite!) we could just provide a default 0 value and then attempt the operation:

class av2(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

    def __add__(self, x):
        """ override addition when self is empty """
        if not self:
            return 0 + x
        raise ValueError

    def __sub__(self, x):
        """ override subtraction when self is empty """
        if not self:
            return 0 - x
        raise ValueError

Late to the party, but for arbitrary depth I just found myself doing something like this:

from collections import defaultdict

class DeepDict(defaultdict):
    def __call__(self):
        return DeepDict(self.default_factory)

The trick here is basically to make the DeepDict instance itself a valid factory for constructing missing values. Now we can do things like

dd = DeepDict(DeepDict(list))
dd[1][2].extend([3,4])
sum(dd[1][2])  # 7

ddd = DeepDict(DeepDict(DeepDict(list)))
ddd[1][2][3].extend([4,5])
sum(ddd[1][2][3])  # 9