
Why is the ternary operator faster than .get for dicts?

I discovered something surprising recently. Given a dict d that does not contain the key k, using the ternary operator to retrieve an item with a default value:

>>> import timeit
>>> def tern():
...     d[k] if k in d else 'foo'
...
>>> timeit.timeit(tern, number=1000000)
0.12342095375061035

runs faster than the dict's .get() function:

>>> def get_meth():
...     d.get(k, 'foo')
...
>>> timeit.timeit(get_meth, number=1000000)
0.20549297332763672

This seems counter-intuitive to me. I would expect the ternary operator to require two lookups in the dict (one to test k in d, then another to retrieve d[k]), while .get would simply attempt to retrieve d[k] and, if that fails, return 'foo'.
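One way to check the "two lookups" assumption is to count how often the key is hashed. The sketch below uses a hypothetical CountingKey class (not part of the original question) whose __hash__ increments a counter, and assumes CPython's dict, which hashes the key once per lookup.

```python
class CountingKey:
    """Hypothetical key type that counts how often it is hashed."""

    def __init__(self, name):
        self.name = name
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


def tern(d, k):
    return d[k] if k in d else 'foo'


def get_meth(d, k):
    return d.get(k, 'foo')


def count_hashes(fn, d, k):
    """Reset the counter, run one lookup, and report the hash calls."""
    k.hash_calls = 0
    fn(d, k)
    return k.hash_calls


present = CountingKey('present')
missing = CountingKey('missing')
d = {present: 'bar'}

for fn in (tern, get_meth):
    for key in (missing, present):
        print(fn.__name__, key.name, count_hashes(fn, d, key))
```

With a missing key, the ternary version should report a single hash call, because d[k] is never evaluated; the second lookup only happens when the key is present. So for the case measured here, both versions perform one dict lookup.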

I ran this on both a large dict (one million elements) and a small one (100), and both times, ternary was significantly faster. What's going on behind the scenes here?
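For anyone who wants to reproduce this, here is a minimal self-contained version of the timing. The contents of d and the value of k are assumptions (the session above does not show how they were built), and the functions return their result, which the originals do not. Absolute numbers will differ by machine and Python version, and on newer interpreters the gap may be smaller or even reversed.

```python
import timeit

# Assumed setup: a small dict and a key that is not in it.
d = {i: str(i) for i in range(100)}
k = -1


def tern():
    return d[k] if k in d else 'foo'


def get_meth():
    return d.get(k, 'foo')


for fn in (tern, get_meth):
    seconds = timeit.timeit(fn, number=1000000)
    print(f'{fn.__name__}: {seconds:.3f}s')
```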

asked Oct 18 '22 01:10 by ewok



1 Answer

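If you disassemble the two functions, you will see that the .get version needs an extra CALL_FUNCTION, which is expensive in Python, whereas the ternary version only needs a cheap POP_JUMP_IF_FALSE. Note also that, because the key is missing, the ternary jumps straight to the default and never executes BINARY_SUBSCR, so it performs only one dict lookup here.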

Ternary (if/in):

  3           0 LOAD_CONST               1 ('blub')
              3 LOAD_GLOBAL              0 (d)
              6 COMPARE_OP               6 (in)
              9 POP_JUMP_IF_FALSE       22
             12 LOAD_GLOBAL              0 (d)
             15 LOAD_CONST               1 ('blub')
             18 BINARY_SUBSCR       
             19 JUMP_FORWARD             3 (to 25)
        >>   22 LOAD_CONST               2 ('foo')
        >>   25 POP_TOP             
             26 LOAD_CONST               0 (None)
             29 RETURN_VALUE        

Get Method:

  6           0 LOAD_GLOBAL              0 (d)
              3 LOAD_ATTR                1 (get)
              6 LOAD_CONST               1 ('blub')
              9 LOAD_CONST               2 ('foo')
             12 CALL_FUNCTION            2          #Expensive call
             15 POP_TOP             
             16 LOAD_CONST               0 (None)
             19 RETURN_VALUE        
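Listings like these can be produced with the standard dis module. This is a small sketch that assumes an empty global dict d, so that 'blub' is missing. Opcode names and offsets vary between CPython versions, so the output on a newer interpreter will probably not match the listings above exactly (for example, the call may show up as CALL_METHOD or CALL instead of CALL_FUNCTION).

```python
import dis

d = {}  # assumed: an empty dict, so 'blub' is not a key


def tern():
    d['blub'] if 'blub' in d else 'foo'


def get_meth():
    d.get('blub', 'foo')


def opnames(fn):
    """Return the opcode names of fn, in order."""
    return [ins.opname for ins in dis.get_instructions(fn)]


for fn in (tern, get_meth):
    print(fn.__name__)
    dis.dis(fn)
    print()
```

The opnames helper is only a convenience for checking which function contains a call opcode, e.g. any(name.startswith('CALL') for name in opnames(get_meth)).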

I read a very long article a while ago that has a section describing why CALL_FUNCTION is so expensive: https://doughellmann.com/blog/2012/11/12/the-performance-impact-of-using-dict-instead-of-in-cpython-2-7-2/

answered Oct 21 '22 04:10 by user1767754
