Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Tuple constructor vs list comp

Just say I have a list

a = (3, 2, 9, 4)

And I want to add one to each number and store the result, (I won't need to manipulate the result after), my first thought would be to go:

[x + 1 for x in a]

But what about:

tuple(x + 1 for x in a)

Tuples are meant to be faster right? And if I don't need to change the result after is this code more efficient? Also how does it really work, does the tuple constructor have to create a list out of the generator expression to know the size in advance? Thanks in advance for any explanation.

like image 908
197 Avatar asked Apr 12 '13 14:04

197


People also ask

Is a tuple better than a list?

List and Tuple in Python are the classes of Python Data Structures. The list is dynamic, whereas the tuple has static characteristics. This means that lists can be modified whereas tuples cannot be modified, the tuple is faster than the list because of static in nature.

Does list comprehension work with tuples?

This is the power of list comprehension. It can identify when it receives a string or a tuple and work on it like a list. You can do that using loops. However, not every loop can be rewritten as list comprehension.

Do tuples have worse performance than lists?

Tuples tend to perform better than lists in almost every category: Tuples can be constant folded. Tuples can be reused instead of copied. Tuples are compact and don't over-allocate.

What is a constructor in relation to lists and tuples?

In Python, tuple(iterable) is tuple constructor. It is used to create tuple. It can also be used to convert a sequence like lists and dictionaries into tuple. i) type(sequence) In Python, type(seq) returns the type of sequence.


2 Answers

just timeit():

In : a = (3, 2, 9, 4)

In : f1 = lambda: [x + 1 for x in a]

In : f2 = lambda: tuple(x + 1 for x in a)

In : timeit.timeit(f1)
Out: 0.595026969909668

In : timeit.timeit(f2)
Out: 2.360887050628662

so it seems that the tuple constructor variant takes about four times as long, I guess because list comprehensions are fairly optimized (in cpython).

But let's take a closer look:

In : f3 = lambda: list(x + 1 for x in a)

In : timeit.timeit(f3)
Out: 2.5421998500823975

so this takes about the same time as the tuple construction, which indicates that the performance penalty lies in the generator expression overhead. (we can rule out list / tuple construction, see edit below)

It is even about twice as slow as to map() the list:

In : inc = partial(operator.add,1)

In : f4 = lambda:map(inc, a)

In : timeit.timeit(f4)
Out: 1.2346529960632324

I think this really boils down to (cpython) implementation details, so don't rely on this. Anyway - don't worry about performance, it's just a factor of 2-4, use the method which is best to read.

If you really hit performance bottlenecks, investigate and optimize them after you noticed them. And I bet a factor 4 in a list manipulation will be the least of your problems, then.

Edit: Someone mentioned that the lookup cost of "tuple" could cause the slowdown, but this is not the case:

In : f5 = lambda: tuple([x + 1 for x in a])

In : timeit.timeit(f5)
Out: 0.7900090217590332

So I guess it really is the generator expressions overhead which slows things down.

like image 157
ch3ka Avatar answered Sep 29 '22 19:09

ch3ka


The dis module can give you some idea of how the code executes internally...

dis.dis(lambda a: [x + 1 for x in a]) yields...

  1           0 BUILD_LIST               0
              3 LOAD_FAST                0 (a)
              6 GET_ITER
        >>    7 FOR_ITER                16 (to 26)
             10 STORE_FAST               1 (x)
             13 LOAD_FAST                1 (x)
             16 LOAD_CONST               1 (1)
             19 BINARY_ADD
             20 LIST_APPEND              2
             23 JUMP_ABSOLUTE            7
        >>   26 RETURN_VALUE

...and dis.dis(lambda a: tuple(x + 1 for x in a)) yields...

  1           0 LOAD_GLOBAL              0 (tuple)
              3 LOAD_CONST               1 (<code object <genexpr> at 0x7f62e9eda930, file "<stdin>", line 1>)
              6 MAKE_FUNCTION            0
              9 LOAD_FAST                0 (a)
             12 GET_ITER
             13 CALL_FUNCTION            1
             16 CALL_FUNCTION            1
             19 RETURN_VALUE

...but you may not be able to infer much from that. If you want to know which is faster, check out the timeit module.

like image 37
Aya Avatar answered Sep 29 '22 18:09

Aya