I am wondering how to create a forgiving dictionary (one that returns a default value instead of raising a KeyError). With the following code, for example, I would get a KeyError:
a = {'one':1,'two':2}
print a['three']
In order not to get one, I would have to either catch the exception or use get. I would rather not have to do that with my dictionary...
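For reference, here is a sketch (my own, not from the question) of the two workarounds being avoided, using the same example dict:

```python
a = {'one': 1, 'two': 2}

# Workaround 1: catch the KeyError explicitly.
try:
    value = a['three']
except KeyError:
    value = 3  # fall back to a default

# Workaround 2: use dict.get with a default value.
value = a.get('three', 3)  # returns 3, since 'three' is absent
```

Both work, but both require remembering the fallback at every lookup site, which is exactly the boilerplate the question wants to push into the dictionary itself.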
import collections
a = collections.defaultdict(lambda: 3)
a.update({'one':1,'two':2})
print a['three']
emits 3
as required. You could also subclass dict yourself and override __missing__, but that doesn't make much sense when the defaultdict behavior (ignoring the exact missing key that's being looked up) suits you so well...
Edit: ...unless, that is, you're worried about a growing by one entry each time you look up a missing key (which is part of defaultdict's semantics) and would rather get slower behavior but save some memory. For example, in terms of memory...:
>>> import sys
>>> a = collections.defaultdict(lambda: 'blah')
>>> print len(a), sys.getsizeof(a)
0 140
>>> for i in xrange(99): _ = a[i]
...
>>> print len(a), sys.getsizeof(a)
99 6284
...the defaultdict, originally empty, now has the 99 previously-missing keys that we looked up, and takes 6284 bytes (vs. the 140 bytes it took when it was empty).
The alternative approach...:
>>> class mydict(dict):
... def __missing__(self, key): return 3
...
>>> a = mydict()
>>> print len(a), sys.getsizeof(a)
0 140
>>> for i in xrange(99): _ = a[i]
...
>>> print len(a), sys.getsizeof(a)
0 140
...entirely saves this memory overhead, as you see. Of course, performance is another issue:
$ python -mtimeit -s'import collections; a=collections.defaultdict(int); r=xrange(99)' 'for i in r: _=a[i]'
100000 loops, best of 3: 14.9 usec per loop
$ python -mtimeit -s'class mydict(dict):
> def __missing__(self, key): return 0
> ' -s'a=mydict(); r=xrange(99)' 'for i in r: _=a[i]'
10000 loops, best of 3: 92.9 usec per loop
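If you prefer to reproduce the comparison from inside Python rather than the shell, the timeit module's Python API works too. This is a sketch in current Python 3 syntax (where xrange is gone); absolute numbers will of course differ by machine:

```python
import collections
import timeit

class mydict(dict):
    def __missing__(self, key):
        return 0

def bench(mapping):
    """Time repeated lookups of the keys 0..98 in the given mapping."""
    r = range(99)  # xrange in Python 2
    def lookup():
        for i in r:
            _ = mapping[i]
    return timeit.timeit(lookup, number=1000)

t_default = bench(collections.defaultdict(int))
t_missing = bench(mydict())
# defaultdict stores each missing key on first lookup, so after the first
# pass its lookups are plain dict hits; mydict calls __missing__ every time.
```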
Since defaultdict adds the (previously-missing) key on lookup, it gets much faster when such a key is next looked up, while mydict (which overrides __missing__ to avoid that addition) pays the "missing key lookup overhead" every time.
Whether you care about either issue (performance vs memory footprint) entirely depends on your specific use case, of course. It is in any case a good idea to be aware of the tradeoff!-)
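The tradeoff can be summarized in one runnable snippet (Python 3 syntax; sys.getsizeof values vary by version and platform, so only the relative growth is meaningful):

```python
import collections
import sys

class mydict(dict):
    def __missing__(self, key):
        return 3  # report a default, but do not store anything

growing = collections.defaultdict(lambda: 3)  # stores each missing key
frugal = mydict()                             # stays empty

for i in range(99):
    _ = growing[i]  # inserts key i with value 3
    _ = frugal[i]   # returns 3 without inserting

print(len(growing), len(frugal))  # -> 99 0
print(sys.getsizeof(growing) > sys.getsizeof(frugal))  # -> True
```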