You don't have to specify a default factory (and omitting it is the same as passing None explicitly):
>>> from collections import defaultdict
>>> defaultdict()
defaultdict(None, {})
>>> defaultdict(None)
defaultdict(None, {})
Why None, though? It leads to this behaviour:
>>> dd = defaultdict()
>>> dd[0]
# TypeError: 'NoneType' object is not callable <-- expected behaviour
# KeyError: 0 <-- actual behaviour
None is even explicitly allowed: if you try to make a defaultdict from some other non-callable object, say defaultdict(0), a check fails with
TypeError: first argument must be callable or None
I thought something like lambda: None would be a better default factory. Why is default_factory optional? I don't understand the use case.
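To make the difference concrete, here is a small sketch comparing the constructors discussed in the question. The variable names are purely illustrative.

```python
from collections import defaultdict

# No factory: a missing key raises KeyError, just like a plain dict.
plain = defaultdict()
try:
    plain[0]
    plain_result = "no error"
except KeyError:
    plain_result = "KeyError"

# Factory that returns None: a missing key is created with the value None.
with_none = defaultdict(lambda: None)
none_value = with_none[0]

# A first argument that is neither callable nor None is rejected up front.
try:
    defaultdict(0)
    ctor_result = "no error"
except TypeError as exc:
    ctor_result = str(exc)

print(plain_result, none_value, dict(with_none), ctor_result)
```

Note that the lambda: None variant silently inserts every key that is looked up, which is a different behaviour from raising KeyError.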
The Python defaultdict type behaves almost exactly like a regular Python dictionary, but if you try to access or modify a missing key, then defaultdict will automatically create the key and generate a default value for it. This makes defaultdict a valuable option for handling missing keys in dictionaries.
A defaultdict works like a normal dict, but it can be initialized with a function (the “default factory”) that takes no arguments and provides the default value for a nonexistent key. While a factory is set, looking up a missing key with d[key] does not raise a KeyError; the key is inserted with the value returned by the default factory. If the factory is None, a missing key raises KeyError as usual.
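As a minimal sketch of the factory being called for a missing key, the following counts letters with int as the factory. The sample word is made up.

```python
from collections import defaultdict

counts = defaultdict(int)  # int() takes no arguments and returns 0
for letter in "banana":    # made-up sample input
    counts[letter] += 1    # first access to a letter inserts it with 0

# Merely reading a missing key also inserts it.
missing = counts["z"]
```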
It depends on the data: setdefault is faster and simpler with small data sets; defaultdict is faster for larger data sets with more homogeneous key sets (i.e., how short the dict is after adding elements).
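For reference, the two idioms being compared look roughly like this. The sketch only shows that they build the same mapping and makes no claim about speed; the sample pairs are made up.

```python
from collections import defaultdict

pairs = [("a", 1), ("b", 2), ("a", 3)]  # made-up sample data

# Plain dict with setdefault: a new list is built on every call,
# even when the key already exists.
grouped_setdefault = {}
for key, value in pairs:
    grouped_setdefault.setdefault(key, []).append(value)

# defaultdict: the list factory is only called for keys that are missing.
grouped_defaultdict = defaultdict(list)
for key, value in pairs:
    grouped_defaultdict[key].append(value)
```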
When Guido van Rossum initially proposed a DefaultDict, it had a default value (unlike the current defaultdict, which uses a callable rather than a value) that was set during construction and was read-only (also unlike defaultdict).
After some discussion, Guido revised the proposal. Here are the relevant highlights:
Many, many people suggested to use a factory function instead of a default value. This is indeed a much better idea (although slightly more cumbersome for the simplest cases).
...
Let's add a generic missing-key handling method to the dict class, as well as a default_factory slot initialized to None.
...
[T]he default implementation is designed so that we can write
d = {}
d.default_factory = list
The important thing to note is that the new functionality no longer belonged to a subclass. That means that setting the default_factory in the constructor would have broken existing code. So, by design, setting the default_factory had to happen after the dict was created. Its initial value is None, and it is a mutable attribute so that it can be meaningfully overwritten.
After yet more discussion, it was decided that maybe it would be best not to complicate the regular dict type with a defaultdict specialization.
Steven Bethard then asked for clarification regarding the constructor:
Should default_factory be an argument to the constructor? The three answers I see:
- "No." I'm not a big fan of this answer. Since the whole point of creating a defaultdict type is to provide a default, requiring two statements (the constructor call and the default_factory assignment) to initialize such a dictionary seems a little inconvenient.
- "Yes and it should be followed by all the normal dict constructor arguments." This is okay, but a few errors, like defaultdict({1:2}) will pass silently (until you try to use the dict, of course).
- "Yes and it should be the only constructor argument." This is my favorite mainly because I think it's simple, and I couldn't think of good examples where I really wanted to do defaultdict(list, some_dict_or_iterable) or defaultdict(list, **some_keyword_args). It's also forward compatible if we need to add some of the dict constructor args in later.
Guido van Rossum decided that:
The defaultdict signature takes an optional positional argument which is the default_factory, defaulting to None. The remaining positional and all keyword arguments are passed to the dict constructor. IOW:
d = defaultdict(list, [(1, 2)])
is equivalent to:
d = defaultdict()
d.default_factory = list
d.update([(1, 2)])
Note that the expanded code mirrors exactly how it worked when Guido was considering altering dict to provide the defaultdict behavior.
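The equivalence Guido describes can be checked directly. This is a small sketch with illustrative variable names; the keyword-argument case at the end is an extra illustration of "all keyword arguments are passed to the dict constructor".

```python
from collections import defaultdict

# Short form: factory first, remaining arguments go to the dict constructor.
short_form = defaultdict(list, [(1, 2)])

# Expanded form: create with no factory, then set it and fill the dict.
long_form = defaultdict()
long_form.default_factory = list
long_form.update([(1, 2)])

# Keyword arguments are forwarded to dict as well.
with_kwargs = defaultdict(int, a=1)

print(short_form == long_form)
```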
He also provides some justifications upthread:
Even if the default_factory were passed to the constructor, it still ought to be a writable attribute so it can be introspected and modified. A defaultdict that can't change its default factory after its creation is less useful.
Bengt Richter explains why you might want a mutable default factory:
My guess is that realistically default_factory will be used to make clean code for filling a dict, and then turning the factory off if it's to be passed into unknown contexts. Those contexts can then use old code to do as above, or if worth it can temporarily set a factory to do some work. Tightly coupled code I guess could pass factory-enabled dicts between each other.
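One way to read that suggestion is sketched below: fill the dict with a factory in place, then set default_factory to None before handing it to other code. The helper name build_index and the sample words are hypothetical.

```python
from collections import defaultdict

def build_index(words):
    """Group words by first letter, then switch the factory off (hypothetical helper)."""
    index = defaultdict(list)
    for word in words:
        index[word[0]].append(word)
    index.default_factory = None  # from here on, missing keys raise KeyError
    return index

index = build_index(["ant", "bee", "bat"])  # made-up sample data

# Downstream code that expects plain-dict semantics is not surprised
# by keys appearing as a side effect of a lookup.
try:
    index["z"]
    lookup = "created"
except KeyError:
    lookup = "KeyError"
```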
My guess is that the design is intentional in order to make a defaultdict instance act like a normal dict, by default, whilst allowing the behaviour to be dynamically modified by simple attribute access later on.
For example:
>>> d = defaultdict()
>>> d['k'] # hey I'm just a plain old dict ;)
KeyError: 'k'
>>> d.default_factory = list
>>> d['L'] # actually, I'm really a defaultdict(list)
[]
>>> d.default_factory = int # just kidding! I'm a counter
>>> d['i']
0
>>> d
defaultdict(<class 'int'>, {'L': [], 'i': 0})
And we can reset it to something that looks like a vanilla dict (which will again raise KeyError), by setting the factory back to None.
I have yet to find a pattern where this is useful, but this usage wouldn't be possible if defaultdict had to be instantiated with a callable as a required positional argument.