
"Tuple comprehensions" and the star splat/unpack operator *

I just read the question Why is there no tuple comprehension in Python?

In the comments of the accepted answer, it is stated that there are no true "tuple comprehensions". Instead, our current option is to use a generator expression and pass the resulting generator object to the tuple constructor:

tuple(thing for thing in things)

Alternatively, we can create a list using a list comprehension and then pass the list to the tuple constructor:

tuple([thing for thing in things])

Lastly, and contrary to the accepted answer, a more recent answer states that tuple comprehensions are indeed possible (since Python 3.5) using the following syntax:

*(thing for thing in things),
  • To me, it seems like the second example is also one where a generator object is created first. Is this correct?

  • Is there any difference between these expressions in terms of what goes on behind the scenes? In terms of performance? I assume the first and third could have latency issues while the second could have memory issues (as is discussed in the linked comments).

  • Comparing the first one and the last, which one is more pythonic?
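To make the comparison concrete, here is a minimal sketch of the three forms side by side. The sample list `things` and the variable names are made up for illustration. It also checks that the starred form needs its trailing comma, since without it the expression is not a tuple display at all.

```python
# Hypothetical sample data, only for illustration.
things = ["a", "b", "c"]

# 1. Generator expression passed to the tuple constructor.
from_genexpr = tuple(thing for thing in things)

# 2. List comprehension passed to the tuple constructor.
from_listcomp = tuple([thing for thing in things])

# 3. Star-unpacking a generator expression inside a tuple display
#    (Python 3.5+). The outer parentheses are optional here; the
#    trailing comma is what makes it a tuple.
from_unpacking = (*(thing for thing in things),)

# All three produce the same tuple.
assert from_genexpr == from_listcomp == from_unpacking

# Without the trailing comma the starred form does not compile.
try:
    compile("t = *(thing for thing in things)", "<example>", "exec")
except SyntaxError:
    starred_without_comma_is_valid = False
else:
    starred_without_comma_is_valid = True
```

The results are identical, so the choice between the forms comes down to readability, version support, and speed.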

Update:

As expected, the list comprehension version is indeed much faster. However, I don't understand why the first one is faster than the third. Any thoughts?

>>> from timeit import timeit
>>> a = 'tuple(i for i in range(10000))'
>>> b = 'tuple([i for i in range(10000)])'
>>> c = '*(i for i in range(10000)),'
>>> print('A:', timeit(a, number=1000000))
A: 438.98362647295824
>>> print('B:', timeit(b, number=1000000))
B: 271.7554752581845
>>> print('C:', timeit(c, number=1000000))
C: 455.59842588083677
Lucubrator asked Nov 08 '22 13:11


1 Answer

To me, it seems like the second example is also one where a generator object is created first. Is this correct?

Yes, you're correct. Check out the CPython bytecode:

>>> import dis
>>> dis.dis("*(thing for thing in thing),")
  1           0 LOAD_CONST               0 (<code object <genexpr> at 0x7f56e9347ed0, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<genexpr>')
              4 MAKE_FUNCTION            0
              6 LOAD_NAME                0 (thing)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 BUILD_TUPLE_UNPACK       1
             14 POP_TOP
             16 LOAD_CONST               2 (None)
             18 RETURN_VALUE
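Opcode names such as BUILD_TUPLE_UNPACK are specific to certain CPython versions, so the exact disassembly may look different on your interpreter. As a rough sketch that does not depend on opcode names, one can instead check whether the compiled code carries a nested `<genexpr>` code object among its constants. The helper name `has_genexpr` is hypothetical, and this relies on a CPython implementation detail.

```python
import types


def has_genexpr(source):
    """Return True if compiling `source` produces a nested <genexpr> code object.

    Relies on CPython storing the generator expression's code object in the
    enclosing code object's constants (an implementation detail).
    """
    code = compile(source, "<example>", "exec")
    return any(
        isinstance(const, types.CodeType) and const.co_name == "<genexpr>"
        for const in code.co_consts
    )


# Compiling does not execute anything, so `things` need not exist.
results = {
    "tuple(genexpr)": has_genexpr("t = tuple(thing for thing in things)"),
    "tuple(listcomp)": has_genexpr("t = tuple([thing for thing in things])"),
    "star-unpacking": has_genexpr("t = *(thing for thing in things),"),
}

for label, found in results.items():
    print(f"{label}: generator code object present = {found}")
```

Both the `tuple(...)` call on a generator expression and the starred form should report a generator code object, while the list comprehension form should not.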

Is there any difference between these expressions in terms of what goes on behind the scenes? In terms of performance? I assume the first and third could have latency issues while the second could have memory issues (as is discussed in the linked comments).

My timings suggest the first one is slightly faster, presumably because unpacking via BUILD_TUPLE_UNPACK is more expensive than the tuple() call:

>>> from timeit import timeit
>>> def f1(): tuple(thing for thing in range(100000))
... 
>>> def f2(): *(thing for thing in range(100000)),
... 
>>> timeit(lambda: f1(), number=100)
0.5535585517063737
>>> timeit(lambda: f2(), number=100)
0.6043887557461858
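Timings like these depend heavily on the Python version and the machine, so it is worth re-running them locally. The following is a small sketch of one way to compare all three forms in one go. The input size, `number`, and `rounds` values are arbitrary assumptions, and taking the minimum of several repeats is just one common way to reduce noise.

```python
from timeit import repeat

# Arbitrary input size chosen for a quick run; adjust as needed.
SETUP = "things = range(1000)"

STATEMENTS = {
    "genexpr": "tuple(thing for thing in things)",
    "listcomp": "tuple([thing for thing in things])",
    "unpacking": "(*(thing for thing in things),)",
}


def benchmark(number=200, rounds=3):
    """Return the best (minimum) total time in seconds for each statement."""
    return {
        label: min(repeat(stmt, setup=SETUP, number=number, repeat=rounds))
        for label, stmt in STATEMENTS.items()
    }


timings = benchmark()

# Print from fastest to slowest.
for label, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{label:>10}: {seconds:.4f} s")
```

No expected output is shown because the ranking and the gaps between the three forms can differ between interpreters.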

Comparing the first one and the last, which one is more pythonic?

The first one seems far more readable to me, and it also works across Python versions, whereas the starred form requires Python 3.5 or later.

Chris_Rands answered Nov 15 '22 13:11