Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is tuple faster than list in Python?

I've just read in "Dive into Python" that "tuples are faster than lists".

Tuple is immutable, and list is mutable, but I don't quite understand why tuple is faster.

Anyone did a performance test on this?

like image 597
Quan Mai Avatar asked Jul 27 '10 03:07

Quan Mai


People also ask

Why tuple is better than list in Python?

The key difference between tuples and lists is that while tuples are immutable objects, lists are mutable. This means tuples cannot be changed while lists can be modified. Tuples are also more memory efficient than the lists.

Why are tuples more efficient than lists?

Tuples are stored in a single block of memory. Tuples are immutable so, It doesn't require extra space to store new objects. Lists are allocated in two blocks: the fixed one with all the Python object information and a variable sized block for the data. It is the reason creating a tuple is faster than List.

Which is faster list or tuple or set?

This means that lists can be modified whereas tuples cannot be modified, the tuple is faster than the list because of static in nature.


3 Answers

The reported "speed of construction" ratio only holds for constant tuples (ones whose items are expressed by literals). Observe carefully (and repeat on your machine -- you just need to type the commands at a shell/command window!)...:

$ python3.1 -mtimeit -s'x,y,z=1,2,3' '[x,y,z]'
1000000 loops, best of 3: 0.379 usec per loop
$ python3.1 -mtimeit '[1,2,3]'
1000000 loops, best of 3: 0.413 usec per loop

$ python3.1 -mtimeit -s'x,y,z=1,2,3' '(x,y,z)'
10000000 loops, best of 3: 0.174 usec per loop
$ python3.1 -mtimeit '(1,2,3)'
10000000 loops, best of 3: 0.0602 usec per loop

$ python2.6 -mtimeit -s'x,y,z=1,2,3' '[x,y,z]'
1000000 loops, best of 3: 0.352 usec per loop
$ python2.6 -mtimeit '[1,2,3]'
1000000 loops, best of 3: 0.358 usec per loop

$ python2.6 -mtimeit -s'x,y,z=1,2,3' '(x,y,z)'
10000000 loops, best of 3: 0.157 usec per loop
$ python2.6 -mtimeit '(1,2,3)'
10000000 loops, best of 3: 0.0527 usec per loop

I didn't do the measurements on 3.0 because of course I don't have it around -- it's totally obsolete and there is absolutely no reason to keep it around, since 3.1 is superior to it in every way (Python 2.7, if you can upgrade to it, measures as being almost 20% faster than 2.6 in each task -- and 2.6, as you see, is faster than 3.1 -- so, if you care seriously about performance, Python 2.7 is really the only release you should be going for!).

Anyway, the key point here is that, in each Python release, building a list out of constant literals is about the same speed, or slightly slower, than building it out of values referenced by variables; but tuples behave very differently -- building a tuple out of constant literals is typically three times as fast as building it out of values referenced by variables! You may wonder how this can be, right?-)

Answer: a tuple made out of constant literals can easily be identified by the Python compiler as being one, immutable constant literal itself: so it's essentially built just once, when the compiler turns the source into bytecodes, and stashed away in the "constants table" of the relevant function or module. When those bytecodes execute, they just need to recover the pre-built constant tuple -- hey presto!-)

This easy optimization cannot be applied to lists, because a list is a mutable object, so it's crucial that, if the same expression such as [1, 2, 3] executes twice (in a loop -- the timeit module makes the loop on your behalf;-), a fresh new list object is constructed anew each time -- and that construction (like the construction of a tuple when the compiler cannot trivially identify it as a compile-time constant and immutable object) does take a little while.

That being said, tuple construction (when both constructions actually have to occur) still is about twice as fast as list construction -- and that discrepancy can be explained by the tuple's sheer simplicity, which other answers have mentioned repeatedly. But, that simplicity does not account for a speedup of six times or more, as you observe if you only compare the construction of lists and tuples with simple constant literals as their items!_)

like image 122
Alex Martelli Avatar answered Nov 02 '22 16:11

Alex Martelli


Executive Summary

Tuples tend to perform better than lists in almost every category:

1) Tuples can be constant folded.

2) Tuples can be reused instead of copied.

3) Tuples are compact and don't over-allocate.

4) Tuples directly reference their elements.

Tuples can be constant folded

Tuples of constants can be precomputed by Python's peephole optimizer or AST-optimizer. Lists, on the other hand, get built-up from scratch:

    >>> from dis import dis

    >>> dis(compile("(10, 'abc')", '', 'eval'))
      1           0 LOAD_CONST               2 ((10, 'abc'))
                  3 RETURN_VALUE   

    >>> dis(compile("[10, 'abc']", '', 'eval'))
      1           0 LOAD_CONST               0 (10)
                  3 LOAD_CONST               1 ('abc')
                  6 BUILD_LIST               2
                  9 RETURN_VALUE 

Tuples do not need to be copied

Running tuple(some_tuple) returns immediately itself. Since tuples are immutable, they do not have to be copied:

>>> a = (10, 20, 30)
>>> b = tuple(a)
>>> a is b
True

In contrast, list(some_list) requires all the data to be copied to a new list:

>>> a = [10, 20, 30]
>>> b = list(a)
>>> a is b
False

Tuples do not over-allocate

Since a tuple's size is fixed, it can be stored more compactly than lists which need to over-allocate to make append() operations efficient.

This gives tuples a nice space advantage:

>>> import sys
>>> sys.getsizeof(tuple(iter(range(10))))
128
>>> sys.getsizeof(list(iter(range(10))))
200

Here is the comment from Objects/listobject.c that explains what lists are doing:

/* This over-allocates proportional to the list size, making room
 * for additional growth.  The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 * Note: new_allocated won't overflow because the largest possible value
 *       is PY_SSIZE_T_MAX * (9 / 8) + 6 which always fits in a size_t.
 */

Tuples refer directly to their elements

References to objects are incorporated directly in a tuple object. In contrast, lists have an extra layer of indirection to an external array of pointers.

This gives tuples a small speed advantage for indexed lookups and unpacking:

$ python3.6 -m timeit -s 'a = (10, 20, 30)' 'a[1]'
10000000 loops, best of 3: 0.0304 usec per loop
$ python3.6 -m timeit -s 'a = [10, 20, 30]' 'a[1]'
10000000 loops, best of 3: 0.0309 usec per loop

$ python3.6 -m timeit -s 'a = (10, 20, 30)' 'x, y, z = a'
10000000 loops, best of 3: 0.0249 usec per loop
$ python3.6 -m timeit -s 'a = [10, 20, 30]' 'x, y, z = a'
10000000 loops, best of 3: 0.0251 usec per loop

Here is how the tuple (10, 20) is stored:

    typedef struct {
        Py_ssize_t ob_refcnt;
        struct _typeobject *ob_type;
        Py_ssize_t ob_size;
        PyObject *ob_item[2];     /* store a pointer to 10 and a pointer to 20 */
    } PyTupleObject;

Here is how the list [10, 20] is stored:

    PyObject arr[2];              /* store a pointer to 10 and a pointer to 20 */

    typedef struct {
        Py_ssize_t ob_refcnt;
        struct _typeobject *ob_type;
        Py_ssize_t ob_size;
        PyObject **ob_item = arr; /* store a pointer to the two-pointer array */
        Py_ssize_t allocated;
    } PyListObject;

Note that the tuple object incorporates the two data pointers directly while the list object has an additional layer of indirection to an external array holding the two data pointers.

like image 41
Raymond Hettinger Avatar answered Nov 02 '22 16:11

Raymond Hettinger


Alex gave a great answer, but I'm going to try to expand on a few things I think worth mentioning. Any performance differences are generally small and implementation specific: so don't bet the farm on them.

In CPython, tuples are stored in a single block of memory, so creating a new tuple involves at worst a single call to allocate memory. Lists are allocated in two blocks: the fixed one with all the Python object information and a variable sized block for the data. That's part of the reason why creating a tuple is faster, but it probably also explains the slight difference in indexing speed as there is one fewer pointer to follow.

There are also optimisations in CPython to reduce memory allocations: de-allocated list objects are saved on a free list so they can be reused, but allocating a non-empty list still requires a memory allocation for the data. Tuples are saved on 20 free lists for different sized tuples so allocating a small tuple will often not require any memory allocation calls at all.

Optimisations like this are helpful in practice, but they may also make it risky to depend too much on the results of 'timeit' and of course are completely different if you move to something like IronPython where memory allocation works quite differently.

like image 28
Duncan Avatar answered Nov 02 '22 16:11

Duncan