Dict inside defaultdict being shared across keys

I have a dictionary nested inside each default value of a defaultdict. I noticed that the inner dictionary is being shared across keys, so every key ends up with the value of the last write. How can I isolate those dictionaries?

>>> from collections import defaultdict
>>> defaults = [('a', 1), ('b', {})]
>>> dd = defaultdict(lambda: dict(defaults))
>>> dd[0]
{'a': 1, 'b': {}}
>>> dd[1]
{'a': 1, 'b': {}}
>>> dd[0]['b']['k'] = 'v'
>>> dd
defaultdict(<function <lambda> at 0x7f4b3688b398>, {0: {'a': 1, 'b': {'k': 'v'}}, 1: {'a': 1, 'b': {'k': 'v'}}})
>>> dd[1]['b']['k'] = 'v2'
>>> dd
defaultdict(<function <lambda> at 0x7f4b3688b398>, {0: {'a': 1, 'b': {'k': 'v2'}}, 1: {'a': 1, 'b': {'k': 'v2'}}})

Notice that the value v was replaced by v2 in both dictionaries. Why is that, and how can I change this behavior without much performance overhead?

Isaac asked Jul 14 '14 21:07


1 Answer

When you do dict(defaults) you get a shallow copy: the outer dictionary is new, but the inner dictionary is not copied, you're just making another reference to it. So when you change that inner dictionary, you're going to see the change everywhere it's referenced.
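
The sharing can be checked directly with an identity test. This is a minimal sketch using the defaults list from the question; the names first and second are only for illustration:

```python
defaults = [('a', 1), ('b', {})]

first = dict(defaults)
second = dict(defaults)

# The outer dicts are distinct objects...
print(first is second)            # False
# ...but both 'b' entries point at the very same inner dict.
print(first['b'] is second['b'])  # True

# A write through one reference is visible through the other.
first['b']['k'] = 'v'
print(second['b'])                # {'k': 'v'}
```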

You need deepcopy here to avoid the problem:

import copy
from collections import defaultdict
defaults = {'a': 1, 'b': {}}
dd = defaultdict(lambda: copy.deepcopy(defaults))
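
To confirm the isolation, here is a short sketch that repeats the writes from the question against the deepcopy version. It assumes nothing beyond the standard library:

```python
import copy
from collections import defaultdict

defaults = {'a': 1, 'b': {}}
dd = defaultdict(lambda: copy.deepcopy(defaults))

# Each missing key triggers a fresh deep copy of defaults.
dd[0]['b']['k'] = 'v'
dd[1]['b']['k'] = 'v2'

print(dd[0]['b'])  # {'k': 'v'}
print(dd[1]['b'])  # {'k': 'v2'}
print(defaults)    # {'a': 1, 'b': {}}  (the template is untouched)
```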

Alternatively, avoid reusing the same inner mutable object across calls by building a fresh dictionary in the factory instead of referencing defaults:

dd = defaultdict(lambda: {'a': 1, 'b': {}})
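
The same idea can be written with a named factory, which may read better once the default structure grows. This is a sketch; make_defaults is a hypothetical helper name, not something from the question. Since the dict literal is evaluated on every call, each key gets its own inner dictionary, and this likely costs less than deepcopy, which has to walk the structure generically (worth measuring if performance matters):

```python
from collections import defaultdict

def make_defaults():
    # Hypothetical helper: the literals are re-evaluated on each call,
    # so every call returns brand-new outer and inner dicts.
    return {'a': 1, 'b': {}}

dd = defaultdict(make_defaults)

dd[0]['b']['k'] = 'v'
dd[1]['b']['k'] = 'v2'

print(dd[0]['b'])                # {'k': 'v'}
print(dd[1]['b'])                # {'k': 'v2'}
print(dd[0]['b'] is dd[1]['b'])  # False
```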

agf answered Oct 05 '22 10:10