Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does collections.defaultdict work?

I've read the examples in python docs, but still can't figure out what this method means. Can somebody help? Here are two examples from the python docs

>>> from collections import defaultdict

>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> for k in s:
...     d[k] += 1
...
>>> d.items()
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]

and

>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

the parameters int and list are for what?

like image 967
Lanston Avatar asked May 05 '11 15:05

Lanston


People also ask

What is collection Defaultdict?

Defaultdict is a container like dictionaries present in the module collections. Defaultdict is a sub-class of the dictionary class that returns a dictionary-like object. The functionality of both dictionaries and defaultdict are almost same except for the fact that defaultdict never raises a KeyError.

How does Defaultdict work Defaultdict will automatically?

The Python defaultdict type behaves almost exactly like a regular Python dictionary, but if you try to access or modify a missing key, then defaultdict will automatically create the key and generate a default value for it. This makes defaultdict a valuable option for handling missing keys in dictionaries.

What is the difference between Defaultdict and dict?

The difference is that a defaultdict will "default" a value if that key has not been set yet. If you didn't use a defaultdict you'd have to check to see if that key exists, and if it doesn't, set it to what you want.

What are the advantages of using Defaultdict over dict?

defaultdict is faster for larger data sets with more homogenous key sets (ie, how short the dict is after adding elements);


Video Answer


3 Answers

Usually, a Python dictionary throws a KeyError if you try to get an item with a key that is not currently in the dictionary. The defaultdict in contrast will simply create any items that you try to access (provided of course they do not exist yet). To create such a "default" item, it calls the function object that you pass to the constructor (more precisely, it's an arbitrary "callable" object, which includes function and type objects). For the first example, default items are created using int(), which will return the integer object 0. For the second example, default items are created using list(), which returns a new empty list object.

like image 56
Sven Marnach Avatar answered Oct 11 '22 09:10

Sven Marnach


defaultdict means that if a key is not found in the dictionary, then instead of a KeyError being thrown, a new entry is created. The type of this new entry is given by the argument of defaultdict.

For example:

somedict = {}
print(somedict[3]) # KeyError

someddict = defaultdict(int)
print(someddict[3]) # print int(), thus 0
like image 290
orlp Avatar answered Oct 11 '22 10:10

orlp


defaultdict

"The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default(value to be returned) up front when the container is initialized."

as defined by Doug Hellmann in The Python Standard Library by Example

How to use defaultdict

Import defaultdict

>>> from collections import defaultdict

Initialize defaultdict

Initialize it by passing

callable as its first argument(mandatory)

>>> d_int = defaultdict(int)
>>> d_list = defaultdict(list)
>>> def foo():
...     return 'default value'
... 
>>> d_foo = defaultdict(foo)
>>> d_int
defaultdict(<type 'int'>, {})
>>> d_list
defaultdict(<type 'list'>, {})
>>> d_foo
defaultdict(<function foo at 0x7f34a0a69578>, {})

**kwargs as its second argument(optional)

>>> d_int = defaultdict(int, a=10, b=12, c=13)
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12})

or

>>> kwargs = {'a':10,'b':12,'c':13}
>>> d_int = defaultdict(int, **kwargs)
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12})

How does it works

As is a child class of standard dictionary, it can perform all the same functions.

But in case of passing an unknown key it returns the default value instead of error. For ex:

>>> d_int['a']
10
>>> d_int['d']
0
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12, 'd': 0})

In case you want to change default value overwrite default_factory:

>>> d_int.default_factory = lambda: 1
>>> d_int['e']
1
>>> d_int
defaultdict(<function <lambda> at 0x7f34a0a91578>, {'a': 10, 'c': 13, 'b': 12, 'e': 1, 'd': 0})

or

>>> def foo():
...     return 2
>>> d_int.default_factory = foo
>>> d_int['f']
2
>>> d_int
defaultdict(<function foo at 0x7f34a0a0a140>, {'a': 10, 'c': 13, 'b': 12, 'e': 1, 'd': 0, 'f': 2})

Examples in the Question

Example 1

As int has been passed as default_factory, any unknown key will return 0 by default.

Now as the string is passed in the loop, it will increase the count of those alphabets in d.

>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> d.default_factory
<type 'int'>
>>> for k in s:
...     d[k] += 1
>>> d.items()
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]
>>> d
defaultdict(<type 'int'>, {'i': 4, 'p': 2, 's': 4, 'm': 1})

Example 2

As a list has been passed as default_factory, any unknown(non-existent) key will return [ ](ie. list) by default.

Now as the list of tuples is passed in the loop, it will append the value in the d[color]

>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> d.default_factory
<type 'list'>
>>> for k, v in s:
...     d[k].append(v)
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
>>> d
defaultdict(<type 'list'>, {'blue': [2, 4], 'red': [1], 'yellow': [1, 3]})
like image 136
Somendra Joshi Avatar answered Oct 11 '22 08:10

Somendra Joshi