Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the best approach in python: multiple OR or IN in if statement?

What is the best approach in python: multiple OR or IN in if statement? Considering performance and best pratices.

if cond == '1' or cond == '2' or cond == '3' or cond == '4': pass

OR

if cond in ['1','2','3','4']: pass
like image 658
Jonathan Simon Prates Avatar asked Jul 12 '13 12:07

Jonathan Simon Prates


2 Answers

The best approach is to use a set:

if cond in {'1','2','3','4'}:

as membership testing in a set is O(1) (constant cost).

The other two approaches are equal in complexity; merely a difference in constant costs. Both the in test on a list and the or chain short-circuit; terminate as soon as a match is found. One uses a sequence of byte-code jumps (jump to the end if True), the other uses a C-loop and an early exit if the value matches. In the worst-case scenario, where cond does not match an element in the sequence either approach has to check all elements before it can return False. Of the two, I'd pick the in test any day because it is far more readable.

like image 124
Martijn Pieters Avatar answered Oct 05 '22 10:10

Martijn Pieters


This actually depends on the version of Python. In Python 2.7 there were no set constants in the bytecode, thus in Python 2 in the case of a fixed constant small set of values use a tuple:

if x in ('2', '3', '5', '7'):
    ...

A tuple is a constant:

>>> dis.dis(lambda: item in ('1','2','3','4'))
  1           0 LOAD_GLOBAL              0 (item)
              3 LOAD_CONST               5 (('1', '2', '3', '4'))
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE

Python is also smart enough to optimize a constant list on Python 2.7 to a tuple:

>>> dis.dis(lambda: item in ['1','2','3','4'])
  1           0 LOAD_GLOBAL              0 (item)
              3 LOAD_CONST               5 (('1', '2', '3', '4'))
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE        

But Python 2.7 bytecode (and compiler) lacks support for constant sets:

>>> dis.dis(lambda: item in {'1','2','3','4'})
  1           0 LOAD_GLOBAL              0 (item)
              3 LOAD_CONST               1 ('1')
              6 LOAD_CONST               2 ('2')
              9 LOAD_CONST               3 ('3')
             12 LOAD_CONST               4 ('4')
             15 BUILD_SET                4
             18 COMPARE_OP               6 (in)
             21 RETURN_VALUE        

Which means that the set in if condition needs to be rebuilt for each test.


However in Python 3.4 the bytecode supports set constants; there the code evaluates to:

>>> dis.dis(lambda: item in {'1','2','3','4'})
  1           0 LOAD_GLOBAL              0 (item)
              3 LOAD_CONST               5 (frozenset({'4', '2', '1', '3'}))
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE

As for the multi-or code, it produces totally hideous bytecode:

>>> dis.dis(lambda: item == '1' or item == '2' or item == '3' or item == '4')
  1           0 LOAD_GLOBAL              0 (item)
              3 LOAD_CONST               1 ('1')
              6 COMPARE_OP               2 (==)
              9 JUMP_IF_TRUE_OR_POP     45
             12 LOAD_GLOBAL              0 (item)
             15 LOAD_CONST               2 ('2')
             18 COMPARE_OP               2 (==)
             21 JUMP_IF_TRUE_OR_POP     45
             24 LOAD_GLOBAL              0 (item)
             27 LOAD_CONST               3 ('3')
             30 COMPARE_OP               2 (==)
             33 JUMP_IF_TRUE_OR_POP     45
             36 LOAD_GLOBAL              0 (item)
             39 LOAD_CONST               4 ('4')
             42 COMPARE_OP               2 (==)
        >>   45 RETURN_VALUE