I am wondering how to create a forgiving dictionary (one that returns a default value instead of raising a KeyError). With the following code, for example, I would get a KeyError:
a = {'one':1,'two':2}
print a['three']
In order not to get one, I would have to either catch the exception or use get. I would rather not have to do that with my dictionary...
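For reference, here is a sketch (my own, not from the question) of the two workarounds being avoided, using the same example dict:

```python
a = {'one': 1, 'two': 2}

# Workaround 1: catch the KeyError explicitly.
try:
    value = a['three']
except KeyError:
    value = 3  # fall back to a default

# Workaround 2: use dict.get with a default value.
value = a.get('three', 3)  # returns 3, since 'three' is absent
```

Both work, but both require remembering the fallback at every lookup site, which is exactly the boilerplate the question wants to push into the dictionary itself.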
import collections
a = collections.defaultdict(lambda: 3)
a.update({'one':1,'two':2})
print a['three']
emits 3
as required. You could also subclass dict yourself and override __missing__, but that doesn't make much sense when the defaultdict behavior (ignoring the exact missing key that's being looked up) suits you so well...
Edit: ...unless, that is, you're worried about a growing by one entry each time you look up a missing key (which is part of defaultdict's semantics) and would rather get slower behavior but save some memory. For example, in terms of memory...:
>>> import sys
>>> a = collections.defaultdict(lambda: 'blah')
>>> print len(a), sys.getsizeof(a)
0 140
>>> for i in xrange(99): _ = a[i]
...
>>> print len(a), sys.getsizeof(a)
99 6284
...the defaultdict, originally empty, now has the 99 previously-missing keys that we looked up, and takes 6284 bytes (vs. the 140 bytes it took when it was empty).
The alternative approach...:
>>> class mydict(dict):
... def __missing__(self, key): return 3
...
>>> a = mydict()
>>> print len(a), sys.getsizeof(a)
0 140
>>> for i in xrange(99): _ = a[i]
...
>>> print len(a), sys.getsizeof(a)
0 140
...entirely saves this memory overhead, as you see. Of course, performance is another issue:
$ python -mtimeit -s'import collections; a=collections.defaultdict(int); r=xrange(99)' 'for i in r: _=a[i]'
100000 loops, best of 3: 14.9 usec per loop
$ python -mtimeit -s'class mydict(dict):
> def __missing__(self, key): return 0
> ' -s'a=mydict(); r=xrange(99)' 'for i in r: _=a[i]'
10000 loops, best of 3: 92.9 usec per loop
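If you prefer to reproduce the comparison from inside Python rather than the shell, the timeit module's Python API works too. This is a sketch in current Python 3 syntax (where xrange is gone); absolute numbers will of course differ by machine:

```python
import collections
import timeit

class mydict(dict):
    def __missing__(self, key):
        return 0

def bench(mapping):
    """Time repeated lookups of the keys 0..98 in the given mapping."""
    r = range(99)  # xrange in Python 2
    def lookup():
        for i in r:
            _ = mapping[i]
    return timeit.timeit(lookup, number=1000)

t_default = bench(collections.defaultdict(int))
t_missing = bench(mydict())
# defaultdict stores each missing key on first lookup, so after the first
# pass its lookups are plain dict hits; mydict calls __missing__ every time.
```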
Since defaultdict adds the (previously-missing) key on lookup, it gets much faster when such a key is next looked up, while mydict (which overrides __missing__ to avoid that addition) pays the "missing key lookup overhead" every time.
Whether you care about either issue (performance vs memory footprint) entirely depends on your specific use case, of course. It is in any case a good idea to be aware of the tradeoff!-)
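The tradeoff can be summarized in one runnable snippet (Python 3 syntax; sys.getsizeof values vary by version and platform, so only the relative growth is meaningful):

```python
import collections
import sys

class mydict(dict):
    def __missing__(self, key):
        return 3  # report a default, but do not store anything

growing = collections.defaultdict(lambda: 3)  # stores each missing key
frugal = mydict()                             # stays empty

for i in range(99):
    _ = growing[i]  # inserts key i with value 3
    _ = frugal[i]   # returns 3 without inserting

print(len(growing), len(frugal))  # -> 99 0
print(sys.getsizeof(growing) > sys.getsizeof(frugal))  # -> True
```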