I was checking out Peter Norvig's code on how to write simple spell checkers. At the beginning, he uses this code to insert words into a dictionary.
def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model
What is the difference between a Python dict and the one that was used here? In addition, what is the lambda for? I checked the API documentation here and it says that defaultdict is actually derived from dict, but how does one decide which one to use?
defaultdict is a dictionary-like container in the collections module. It is a subclass of the built-in dict class, so it supports every dict operation. The two behave almost identically, except that looking up a missing key with d[key] on a defaultdict returns a default value instead of raising a KeyError (provided a default factory was supplied).
defaultdict tends to be faster for larger data sets with more homogeneous key sets (i.e., when the dict stays short relative to the number of elements added).
A defaultdict works like a normal dict, but it is initialized with a function (the "default factory") that takes no arguments and provides the default value for a nonexistent key. Looking up a missing key with d[key] does not raise a KeyError; instead, the key is inserted with the value returned by the default factory, and that value is returned.
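A minimal sketch of that behaviour, using int as the default factory (the key names are made up for illustration). It also shows that only d[key] triggers the factory; .get() does not.

```python
from collections import defaultdict

# int is the default factory: int() takes no arguments and returns 0.
counts = defaultdict(int)

# Reading a missing key calls the factory, stores the result, and returns it.
value = counts["missing"]

print(value)                 # 0
print("missing" in counts)   # True: the lookup inserted the key
print(counts.get("other"))   # None: .get() does not call the factory
print("other" in counts)     # False
```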
First, we define 'a' as a list of tuples holding the key-value pairs. Next, we pass 'list' to defaultdict() and store the result in 'b'. This tells the interpreter that b is a dictionary whose default values are lists. Then we iterate over the tuples with a for-loop, using the names 'i' and 'j'.
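The code this paragraph describes is not shown here, so the following is only a sketch of what it might look like. The names a, b, i and j come from the description; the sample tuples are invented.

```python
from collections import defaultdict

# Hypothetical sample data: a list of (key, value) tuples.
a = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "banana")]

# list is the default factory, so every missing key starts as an empty list.
b = defaultdict(list)

# Group the values under their keys.
for i, j in a:
    b[i].append(j)

print(dict(b))  # {'fruit': ['apple', 'banana'], 'veg': ['carrot']}
```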
The difference is that a defaultdict will "default" a value if that key has not been set yet. If you didn't use a defaultdict you'd have to check to see if that key exists, and if it doesn't, set it to what you want.
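To make the comparison concrete, here is a small sketch that counts words both ways. The word list is illustrative only.

```python
from collections import defaultdict

words = ["spam", "eggs", "spam"]  # illustrative input

# Plain dict: check for the key before updating it.
plain = {}
for w in words:
    if w not in plain:
        plain[w] = 0
    plain[w] += 1

# defaultdict: the missing-key case is handled by the factory.
auto = defaultdict(int)
for w in words:
    auto[w] += 1

print(plain)          # {'spam': 2, 'eggs': 1}
print(plain == auto)  # True
```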
The lambda is defining a factory for the default value. That function gets called whenever it needs a default value. You could hypothetically have a more complicated default function.
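As a sketch of what lambda: 1 does in the train function from the question: every word starts at 1, so a word seen once ends up at 2 and an unseen word reads as 1. The word list below is made up.

```python
import collections

def train(features):
    # lambda: 1 is the default factory, so unseen keys start at 1.
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

model = train(["the", "cat", "the"])  # made-up word list

print(model["the"])   # 3
print(model["cat"])   # 2
print(model["dog"])   # 1 (never seen, so the factory value is returned)
```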
Help on class defaultdict in module collections:

class defaultdict(__builtin__.dict)
 |  defaultdict(default_factory) --> dict with default factory
 |
 |  The default factory is called without arguments to produce
 |  a new value when a key is not present, in __getitem__ only.
 |  A defaultdict compares equal to a dict with the same items.
 |
(from help(type(collections.defaultdict())))
{}.setdefault is similar in nature, but takes in a value instead of a factory function. It's used to set the value if it doesn't already exist... which is a bit different, though.
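A short sketch of the two side by side, assuming a grouping use case. Note that setdefault is given the default value on every call, while defaultdict is given the factory once.

```python
from collections import defaultdict

# setdefault takes the default value itself, supplied at each call site.
plain = {}
plain.setdefault("a", []).append(1)
plain.setdefault("a", []).append(2)  # key exists, so the new [] is discarded

# defaultdict takes a factory once, at construction time.
auto = defaultdict(list)
auto["a"].append(1)
auto["a"].append(2)

print(plain)          # {'a': [1, 2]}
print(plain == auto)  # True
```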
Courtesy: https://shirishweb.wordpress.com/2017/05/06/python-defaultdict-versus-dict-get/
Using Normal dict
d = {}
d['Apple'] = 50
d['Orange'] = 20
print(d['Apple'])
print(d['Grapes'])  # This gives KeyError
We can avoid this KeyError in a normal dict as well by supplying a default with get(). Let's see how:
d = {}
d['Apple'] = 50
d['Orange'] = 20
print(d['Apple'])
print(d.get('Apple'))
print(d.get('Grapes', 0))  # DEFAULTING
Using default dict
from collections import defaultdict

d = defaultdict(int)  # the argument is the default factory; int() returns 0
d['Apple'] = 50
d['Orange'] = 20
print(d['Apple'])
print(d['Grapes'])  # This will not give an error; it prints 0
Using a user-defined function to supply the default value
from collections import defaultdict

def mydefault():
    return 0

d = defaultdict(mydefault)
d['Apple'] = 50
d['Orange'] = 20
print(d['Apple'])
print(d['Grapes'])
Summary
With a normal dict, the default has to be supplied case by case at each lookup; with defaultdict, the default is defined once and applies to every missing key.
According to the performance test linked below, defaulting with defaultdict was about twice as fast as defaulting with a normal dict. See the link for details: https://shirishweb.wordpress.com/2017/05/06/python-defaultdict-versus-dict-get/
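To check the difference on your own machine, a rough timing sketch along these lines may help. The workload (10,000 insertions over 100 distinct keys) and the function names are arbitrary choices for illustration, not the linked article's benchmark, and the numbers will vary by Python version and hardware.

```python
import timeit
from collections import defaultdict

keys = [i % 100 for i in range(10_000)]  # arbitrary workload: 100 distinct keys

def count_with_get():
    # Normal dict, defaulting at each lookup with get().
    d = {}
    for k in keys:
        d[k] = d.get(k, 0) + 1
    return d

def count_with_defaultdict():
    # defaultdict, defaulting through the int factory.
    d = defaultdict(int)
    for k in keys:
        d[k] += 1
    return d

t_get = timeit.timeit(count_with_get, number=50)
t_dd = timeit.timeit(count_with_defaultdict, number=50)
print(f"dict.get:    {t_get:.4f} s")
print(f"defaultdict: {t_dd:.4f} s")
```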