Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Tuple or list when using 'in' in an 'if' clause?

Which approach is better? Using a tuple, like:

if number in (1, 2): 

or a list, like:

if number in [1, 2]: 

Which one is recommended for such uses and why (both logical and performance wise)?

like image 659
linkyndy Avatar asked Aug 18 '14 16:08

linkyndy


People also ask

Should I use tuple or list?

Tuples are more memory efficient than the lists. When it comes to the time efficiency, again tuples have a slight advantage over the lists especially when lookup to a value is considered. If you have data which is not meant to be changed in the first place, you should choose tuple data type over lists.

Can you use in with a tuple?

Python's in operator lets you loop through all the members of a collection(such as a list or a tuple) and check if there's a member in the tuple that's equal to the given item.

Which is faster tuple or list?

Tuple is stored in a single block of memory. Creating a tuple is faster than creating a list. Creating a list is slower because two memory blocks need to be accessed.


1 Answers

The CPython interpreter replaces the second form with the first.

That's because loading the tuple from a constant is one operation, but the list would be 3 operations; load the two integer contents and build a new list object.

Because you are using a list literal that isn't otherwise reachable, it is substituted for a tuple:

>>> import dis >>> dis.dis(compile('number in [1, 2]', '<stdin>', 'eval'))   1           0 LOAD_NAME                0 (number)               3 LOAD_CONST               2 ((1, 2))               6 COMPARE_OP               6 (in)               9 RETURN_VALUE         

Here the second bytecode loads a (1, 2) tuple as a constant, in one step. Compare this to creating a list object not used in a membership test:

>>> dis.dis(compile('[1, 2]', '<stdin>', 'eval'))   1           0 LOAD_CONST               0 (1)               3 LOAD_CONST               1 (2)               6 BUILD_LIST               2               9 RETURN_VALUE         

Here N+1 steps are required for a list object of length N.

This substitution is a CPython-specific peephole optimisation; see the Python/peephole.c source. For other Python implementations then, you want to stick with immutable objects instead.

That said, the best option when using Python 3.2 and up, is to use a set literal:

if number in {1, 2}: 

as the peephole optimiser will replace that with a frozenset() object and membership tests against sets are O(1) constant operations:

>>> dis.dis(compile('number in {1, 2}', '<stdin>', 'eval'))   1           0 LOAD_NAME                0 (number)               3 LOAD_CONST               2 (frozenset({1, 2}))               6 COMPARE_OP               6 (in)               9 RETURN_VALUE 

This optimization was added in Python 3.2 but wasn't backported to Python 2.

As such, the Python 2 optimiser doesn't recognize this option and the cost of building either a set or frozenset from the contents is almost guaranteed to be more costly than using a tuple for the test.

Set membership tests are O(1) and fast; testing against a tuple is O(n) worst case. Although testing against a set has to calculate the hash (higher constant cost, cached for immutable types), the cost for testing against a tuple other than the first element is always going to be higher. So on average, sets are easily faster:

>>> import timeit >>> timeit.timeit('1 in (1, 3, 5)', number=10**7)  # best-case for tuples 0.21154764899984002 >>> timeit.timeit('8 in (1, 3, 5)', number=10**7)  # worst-case for tuples 0.5670104179880582 >>> timeit.timeit('1 in {1, 3, 5}', number=10**7)  # average-case for sets 0.2663505630043801 >>> timeit.timeit('8 in {1, 3, 5}', number=10**7)  # worst-case for sets 0.25939063701662235 
like image 68
Martijn Pieters Avatar answered Oct 02 '22 19:10

Martijn Pieters