
Using lambda and defaultdict

I was reading about the collection defaultdict and came across these lines of code:

import collections
tree = lambda: collections.defaultdict(tree)
some_dict = tree()
some_dict['colours']['favourite'] = "yellow"

I understand that lambda takes a variable and performs some function on it. I've seen lambda used like this: lambda x: x + 3. In the second line of code above, what variable is lambda taking, and what function is it carrying out?

I also understand that defaultdict can take parameters such as int or list. In the second line, defaultdict takes the parameter tree which is a variable. What is the significance of that?

Anya asked Jul 11 '18 13:07



2 Answers

The code is roughly equivalent (ignoring metadata introduced by the def statement) to

import collections
def tree():
    return collections.defaultdict(tree)
some_dict = tree()
some_dict['colours']['favourite'] = "yellow"

The lambda expression simply defines a function of zero parameters, and the function is bound to the name tree.
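To check the equivalence described above, here is a minimal sketch that builds the same nested structure with both forms. The names tree_lambda and tree_def are made up for this comparison and do not appear in the question.

```python
import collections

# Lambda form, as in the question (renamed so both forms can coexist).
tree_lambda = lambda: collections.defaultdict(tree_lambda)

# Equivalent def form.
def tree_def():
    return collections.defaultdict(tree_def)

a = tree_lambda()
b = tree_def()
a['colours']['favourite'] = "yellow"
b['colours']['favourite'] = "yellow"

print(a == b)                # True: both hold the same nested contents
print(tree_lambda.__name__)  # <lambda>
print(tree_def.__name__)     # tree_def
```

The only visible difference here is the metadata: the def statement records the function's name, while the lambda is reported as '<lambda>'.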

Typically, you only use lambda expressions when you actually want an anonymous function, for example when passing it as an argument to another function, as in

sorted_list = sorted(some_list_of_tuples, key=lambda x: x[0])

It is considered better practice to use a def statement when you really want a named function.


defaultdict takes a callable to be used to produce a default value for a new key. int() returns 0, list() returns an empty list, and tree() returns a new defaultdict; all of them can be used as arguments to defaultdict. The recursive nature of defining tree to return a defaultdict using itself as the default-value generator means you can generate nested dicts to an arbitrary depth; each "leaf" dict is itself another defaultdict.
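As a small illustration of the three factories mentioned above, the sketch below uses int, list, and tree side by side. The variable names counts, groups, and nested are chosen only for this example.

```python
from collections import defaultdict

def tree():
    return defaultdict(tree)

counts = defaultdict(int)
counts['a'] += 1           # missing key starts at int(), i.e. 0

groups = defaultdict(list)
groups['even'].append(2)   # missing key starts at list(), i.e. []

nested = tree()
nested['a']['b']['c']['d'] = 1   # every missing level is created by tree()

print(dict(counts))                          # {'a': 1}
print(dict(groups))                          # {'even': [2]}
print(nested['a']['b']['c']['d'])            # 1
print(type(nested['a']['b']['c']).__name__)  # defaultdict
```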

chepner answered Oct 22 '22 23:10


In the second line of code above, what variable is lambda taking and what function is it carrying out?

A lambda expression creates an anonymous function (a function without a name). So a lambda expression like:

tree = lambda: collections.defaultdict(tree)

is equivalent to the following, except for some details (for example, the __name__ attribute of the def version holds the name of the function, 'tree', and not '<lambda>'):

def tree():
    return collections.defaultdict(tree)

The difference from a simple expression is that here we wrap the computation in a function. We may never call it, call it once, or call it many times.

It also allows us to tie a knot. Notice that the result holds a reference to the function (the lambda expression) itself: we have a function that constructs a defaultdict whose factory is that same function. We can thus construct subtrees recursively.
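The knot can be inspected through the default_factory attribute of a defaultdict. The sketch below is only an illustration; root and child are names chosen for this example. The self-reference works because the name tree inside the lambda body is looked up when the function is called, by which time the assignment has already happened.

```python
import collections

tree = lambda: collections.defaultdict(tree)

root = tree()
child = root['anything']   # missing key, so tree() is invoked

print(root.default_factory is tree)   # True
print(child.default_factory is tree)  # True
print(child is root['anything'])      # True: the new value was stored under the key
print(tree() is tree())               # False: each call builds a fresh defaultdict
```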

I also understand that defaultdict can take parameters such as int or list. In the second line, defaultdict takes the parameter tree which is a variable. What is the significance of that?

The tree that we pass to the defaultdict is thus a reference to the function created by the lambda expression. This means that whenever the defaultdict invokes its "factory", we get another defaultdict, again with tree as its factory.

If we evaluate some_dict['foo']['bar']['qux'], we thus get a defaultdict in a defaultdict in a defaultdict. All these defaultdicts have the tree function as their factory. If we later construct extra children, these will again be defaultdicts with tree as their factory.
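A short sketch of that lookup follows. Note that merely reading the keys is enough to create each level. The to_plain function is a hypothetical helper written for this example, not part of the collections module; it copies the nested structure into ordinary dicts so it prints compactly.

```python
import collections

def tree():
    return collections.defaultdict(tree)

def to_plain(d):
    """Hypothetical helper: recursively copy nested dicts into plain dicts."""
    return {k: to_plain(v) if isinstance(v, dict) else v for k, v in d.items()}

some_dict = tree()
some_dict['foo']['bar']['qux']   # a lookup only, no assignment

print(to_plain(some_dict))       # {'foo': {'bar': {'qux': {}}}}
print(some_dict['foo']['bar']['qux'].default_factory is tree)  # True
```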

The list or int case is not special. If you invoke list (like list()), then you construct a new empty list. The same happens with int: if you call int(), you will obtain 0. The fact that list and int are class objects is irrelevant: the defaultdict does not take this into account (it does not know what the factory is; it only invokes it with no arguments).
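To illustrate that the factory can be any zero-argument callable, here is a minimal sketch with a hand-written one. The factory function and the calls list are hypothetical names used only to record when the defaultdict invokes it.

```python
from collections import defaultdict

calls = []

def factory():
    # Hypothetical factory that records each invocation.
    calls.append('called')
    return 'default'

d = defaultdict(factory)
d['x']        # missing key: factory() is invoked and the result is stored
d['x']        # key now present: factory is not invoked again
d.get('y')    # .get() does not invoke the factory
'z' in d      # a membership test does not invoke the factory

print(calls)    # ['called']
print(dict(d))  # {'x': 'default'}
```

The factory runs only when a missing key is looked up with square brackets, which is why only 'x' ends up in the dictionary.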

Willem Van Onsem answered Oct 22 '22 23:10