
What is the difference between dict and collections.defaultdict?

I was checking out Peter Norvig's code on how to write simple spell checkers. At the beginning, he uses this code to insert words into a dictionary.

import collections

def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

What is the difference between a Python dict and the one that was used here? In addition, what is the lambda for? I checked the API documentation here and it says that defaultdict is actually derived from dict but how does one decide which one to use?

asked Jul 05 '11 23:07 by Legend




2 Answers

The difference is that a defaultdict will "default" a value if that key has not been set yet. If you didn't use a defaultdict you'd have to check to see if that key exists, and if it doesn't, set it to what you want.
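As a rough illustration of that difference, the sketch below counts words twice: once with a plain dict and an explicit key check, and once with a defaultdict. The word list is made-up sample data, and `int` is used as the factory because calling `int()` returns 0.

```python
from collections import defaultdict

words = ["spam", "egg", "spam"]  # made-up sample data

# Plain dict: check whether the key exists before updating it.
plain = {}
for w in words:
    if w not in plain:
        plain[w] = 0
    plain[w] += 1

# defaultdict: a missing key is filled in by calling int(), which returns 0.
counts = defaultdict(int)
for w in words:
    counts[w] += 1

print(plain)         # {'spam': 2, 'egg': 1}
print(dict(counts))  # {'spam': 2, 'egg': 1}
```

Both loops produce the same counts; the defaultdict version just moves the "set it if missing" step into the dictionary itself.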

The lambda is defining a factory for the default value. That function gets called whenever it needs a default value. You could hypothetically have a more complicated default function.
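To make the factory idea concrete, here is a small sketch that reuses the `train` function from the question and then shows a more elaborate factory. The input words, the `new_record` function, and the `index` dictionary are hypothetical names chosen only for illustration.

```python
import collections

def train(features):
    # Every unseen word starts at 1 because the factory returns 1.
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

model = train(["the", "cat", "the"])
print(model["the"])  # 3: default of 1, plus one per occurrence
print(model["dog"])  # 1: the factory runs, and "dog" is now stored as a key

# A hypothetical, more complicated factory: each missing key gets its own record.
def new_record():
    return {"count": 0, "positions": []}

index = collections.defaultdict(new_record)
index["the"]["count"] += 1
index["the"]["positions"].append(0)
print(index["the"])  # {'count': 1, 'positions': [0]}
```

Because the factory is called once per missing key, each key gets a fresh object rather than a shared one.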

Help on class defaultdict in module collections:

class defaultdict(__builtin__.dict)
 |  defaultdict(default_factory) --> dict with default factory
 |
 |  The default factory is called without arguments to produce
 |  a new value when a key is not present, in __getitem__ only.
 |  A defaultdict compares equal to a dict with the same items.
 |

(from help(type(collections.defaultdict())))
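The phrase "in __getitem__ only" in the help text is worth a quick check. The sketch below, using a made-up key name, suggests that `get()` and `in` leave the dictionary untouched, while `d[key]` calls the factory and stores the result.

```python
from collections import defaultdict

d = defaultdict(list)

via_get = d.get("missing")   # None: get() does not call the factory
has_key = "missing" in d     # False: a membership test does not either
size_before = len(d)         # 0: nothing has been inserted so far

via_index = d["missing"]     # []: d[key] calls list() and stores the result
size_after = len(d)          # 1: the key now exists

# A defaultdict compares equal to a dict with the same items.
same_as_dict = d == {"missing": []}

print(via_get, has_key, size_before)         # None False 0
print(via_index, size_after, same_as_dict)   # [] 1 True
```

One practical consequence is that merely reading `d[key]` for a missing key grows the dictionary, which matters if you later iterate over its keys.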

{}.setdefault is similar in nature, but takes in a value instead of a factory function. It's used to set the value if it doesn't already exist... which is a bit different, though.
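For comparison, here is a minimal sketch of `setdefault` next to a defaultdict, grouping some made-up words by their first letter. Note that `setdefault` receives the default value itself, so the `[]` argument is built on every call, even when the key already exists.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana"]  # made-up sample data

# setdefault: pass the default value directly on each call.
groups = {}
for word in words:
    groups.setdefault(word[0], []).append(word)

# defaultdict: pass a factory once; it is called only for missing keys.
groups_dd = defaultdict(list)
for word in words:
    groups_dd[word[0]].append(word)

print(groups)           # {'a': ['apple', 'avocado'], 'b': ['banana']}
print(dict(groups_dd))  # {'a': ['apple', 'avocado'], 'b': ['banana']}
```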

answered Sep 20 '22 21:09 by Donald Miner



Courtesy: https://shirishweb.wordpress.com/2017/05/06/python-defaultdict-versus-dict-get/

Using Normal dict

d = {}
d['Apple'] = 50
d['Orange'] = 20
print(d['Apple'])   # 50
print(d['Grapes'])  # raises KeyError: 'Grapes'

We can avoid this KeyError with a normal dict as well, by passing a default to get():

d = {}
d['Apple'] = 50
d['Orange'] = 20
print(d['Apple'])          # 50
print(d.get('Apple'))      # 50
print(d.get('Grapes', 0))  # 0, the default passed to get()

Using defaultdict

from collections import defaultdict

d = defaultdict(int)  # the argument is a factory; int() returns 0 for missing keys
d['Apple'] = 50
d['Orange'] = 20
print(d['Apple'])   # 50
print(d['Grapes'])  # 0, no KeyError; 'Grapes' is also added to d

Using a user-defined function to supply the default value

from collections import defaultdict

def mydefault():
    return 0

d = defaultdict(mydefault)
d['Apple'] = 50
d['Orange'] = 20
print(d['Apple'])   # 50
print(d['Grapes'])  # 0, returned by mydefault()

Summary

  1. With a normal dict, the default is supplied case by case at each lookup (for example with get()); with defaultdict, the default is set once for the whole dictionary.

  2. According to the performance test at the following link, defaulting with defaultdict is about twice as fast as defaulting with a normal dict: https://shirishweb.wordpress.com/2017/05/06/python-defaultdict-versus-dict-get/
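If you want to check the performance point on your own machine, a rough timing sketch with `timeit` might look like the following. This is not the linked author's benchmark: the word list, the function names, and the repeat count are all assumptions made for illustration, and the measured ratio will vary with the data and the Python version.

```python
import timeit
from collections import defaultdict

WORDS = ["apple", "orange", "grapes", "apple"] * 250  # made-up sample data

def count_with_get():
    # Plain dict, supplying the default at every lookup.
    counts = {}
    for w in WORDS:
        counts[w] = counts.get(w, 0) + 1
    return counts

def count_with_defaultdict():
    # defaultdict, where int() supplies 0 for missing keys.
    counts = defaultdict(int)
    for w in WORDS:
        counts[w] += 1
    return counts

get_seconds = timeit.timeit(count_with_get, number=200)
dd_seconds = timeit.timeit(count_with_defaultdict, number=200)
print(f"dict.get:    {get_seconds:.4f} s")
print(f"defaultdict: {dd_seconds:.4f} s")
```

Both functions return the same counts, so any difference in the printed times comes from how the default is supplied.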

answered Sep 23 '22 21:09 by sakeesh
