I discovered something surprising recently. Given a dict d that does not contain the key k, using the ternary operator to attempt to retrieve an item with a default return:
>>> def tern():
...     d[k] if k in d else 'foo'
...
>>> timeit.timeit(tern, number=1000000)
0.12342095375061035
runs faster than the dict's .get() function:
>>> def get_meth():
...     d.get(k, 'foo')
...
>>> timeit.timeit(get_meth, number=1000000)
0.20549297332763672
This seems counter-intuitive to me. I would think that the ternary operator would require 2 searches through the dict (once to test k in d, then another to retrieve d[k]), while .get would simply attempt to retrieve d[k], and if it fails, return 'foo'.
I ran this on both a large dict (one million elements) and a small one (100), and both times, ternary was significantly faster. What's going on behind the scenes here?
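For anyone who wants to reproduce the comparison, here is a minimal self-contained sketch. The dict contents, the key name 'missing' and the repeat count are assumptions for illustration, not the setup behind the timings above, and the numbers (possibly even the ordering) can differ between Python versions.

```python
import timeit

# Assumed setup: a small dict that does NOT contain the key being looked up.
d = {i: str(i) for i in range(100)}
k = 'missing'  # hypothetical key, absent from d


def tern():
    # Membership test first; the subscript only runs on a hit.
    return d[k] if k in d else 'foo'


def get_meth():
    # A single method call with a default value.
    return d.get(k, 'foo')


n = 1000000
print('ternary:', timeit.timeit(tern, number=n))
print('get:    ', timeit.timeit(get_meth, number=n))
```

Both functions return 'foo' here because the key is missing, so only the cost of getting there differs.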
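If you disassemble the two functions, you will see that get has an extra CALL_FUNCTION, which is expensive in Python compared to a POP_JUMP_IF_FALSE instruction. The ternary does perform two lookups, but each one is a single bytecode instruction (COMPARE_OP and BINARY_SUBSCR), whereas .get also needs an attribute lookup (LOAD_ATTR) and a function call.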
Ternary (if ... in):
  3           0 LOAD_CONST               1 ('blub')
              3 LOAD_GLOBAL              0 (d)
              6 COMPARE_OP               6 (in)
              9 POP_JUMP_IF_FALSE       22
             12 LOAD_GLOBAL              0 (d)
             15 LOAD_CONST               1 ('blub')
             18 BINARY_SUBSCR
             19 JUMP_FORWARD             3 (to 25)
        >>   22 LOAD_CONST               2 ('foo')
        >>   25 POP_TOP
             26 LOAD_CONST               0 (None)
             29 RETURN_VALUE
Get Method:
  6           0 LOAD_GLOBAL              0 (d)
              3 LOAD_ATTR                1 (get)
              6 LOAD_CONST               1 ('blub')
              9 LOAD_CONST               2 ('foo')
             12 CALL_FUNCTION            2    # Expensive call
             15 POP_TOP
             16 LOAD_CONST               0 (None)
             19 RETURN_VALUE
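To produce listings like the ones above yourself, the standard dis module can be pointed at the two functions. This is a sketch under assumptions: the functions and the empty dict are re-created here for illustration, and opcode names depend on the interpreter version. The listings above come from Python 2.7; newer CPython releases use names such as CALL_METHOD or CALL instead of CALL_FUNCTION, so the check below only looks for "CALL" anywhere in the opcode name.

```python
import dis

d = {}  # assumed: any dict will do, its contents do not affect the bytecode


def tern():
    d['blub'] if 'blub' in d else 'foo'


def get_meth():
    d.get('blub', 'foo')


def opnames(func):
    # Collect the opcode names that make up a function's bytecode.
    return [ins.opname for ins in dis.get_instructions(func)]


# Print the human-readable disassembly of both functions.
dis.dis(tern)
dis.dis(get_meth)

# Check which of the two functions contains a call instruction.
tern_has_call = any('CALL' in name for name in opnames(tern))
get_has_call = any('CALL' in name for name in opnames(get_meth))
print('tern has a call opcode:    ', tern_has_call)
print('get_meth has a call opcode:', get_has_call)
```

Note that dis.get_instructions requires Python 3.4 or later; on Python 2.7 only dis.dis is available.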
A very long article I read a while ago has a section that describes why CALL_FUNCTION is so expensive:
https://doughellmann.com/blog/2012/11/12/the-performance-impact-of-using-dict-instead-of-in-cpython-2-7-2/
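One way to see the call overhead in isolation is to time the same membership test written once as an operator and once as an explicit method call. This is only an illustrative sketch, not something taken from the article; d, k and the repeat count are assumed, and the size of the gap varies by Python version.

```python
import timeit

d = {i: i for i in range(100)}  # assumed sample dict
k = 'missing'                   # hypothetical key, absent from d


def via_operator():
    # The "in" operator compiles to a single containment opcode.
    return k in d


def via_method():
    # Same lookup, but reached through an attribute lookup plus a call.
    return d.__contains__(k)


n = 1000000
print('operator:', timeit.timeit(via_operator, number=n))
print('method:  ', timeit.timeit(via_method, number=n))
```

Both wrappers pay the same cost for being called by timeit, so any difference between the two timings comes from the extra attribute lookup and call inside via_method.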