While answering the question "Clunky calculation of differences between an incrementing set of numbers, is there a more beautiful way?", I came up with two solutions, one with a list comprehension and the other using itertools.starmap.
To me, the list comprehension syntax looks more lucid, readable, less verbose and more Pythonic. Still, since starmap is readily available in itertools, I figure there has to be a reason for it.
My question is: when would starmap be preferred over a list comprehension?
Note: if it's a matter of style, then it definitely contradicts "There should be one-- and preferably only one --obvious way to do it."
Head to Head Comparison
Readability counts. --- LC
It's again a matter of perception, but to me LC is more readable than starmap.
To use starmap, you need to either import operator, define a lambda, or write an explicit multi-variable function, and on top of that you need the extra import from itertools.
Performance --- LC
>>> import random
>>> from itertools import izip, starmap
>>> from operator import sub
>>> from timeit import Timer
>>> def using_star_map(nums):
...     delta = starmap(sub, izip(nums[1:], nums))
...     return sum(delta) / float(len(nums) - 1)
...
>>> def using_LC(nums):
...     # Despite the name, this is a generator expression, not a list comprehension.
...     delta = (x - y for x, y in izip(nums[1:], nums))
...     return sum(delta) / float(len(nums) - 1)
...
>>> nums = [random.randint(1, 10) for _ in range(100000)]
>>> t1 = Timer(stmt='using_star_map(nums)', setup='from __main__ import nums, using_star_map')
>>> t2 = Timer(stmt='using_LC(nums)', setup='from __main__ import nums, using_LC')
>>> print "%.2f usec/pass" % (1000000 * t1.timeit(number=1000)/100000)
235.03 usec/pass
>>> print "%.2f usec/pass" % (1000000 * t2.timeit(number=1000)/100000)
181.87 usec/pass
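For readers on Python 3, where izip is gone and zip is already lazy, the two functions above can be sketched as follows. This is a port made under that assumption, not the code that produced the timings above. The name using_genexp, the 1000-element input and number=100 are choices made here for illustration only.

```python
import random
from itertools import starmap
from operator import sub
from timeit import timeit


def using_star_map(nums):
    # Pair each element with its predecessor and subtract via operator.sub.
    delta = starmap(sub, zip(nums[1:], nums))
    return sum(delta) / (len(nums) - 1)


def using_genexp(nums):
    # Same computation written as a generator expression.
    delta = (x - y for x, y in zip(nums[1:], nums))
    return sum(delta) / (len(nums) - 1)


nums = [random.randint(1, 10) for _ in range(1000)]

# Both spellings compute the same mean difference.
assert using_star_map(nums) == using_genexp(nums)

if __name__ == "__main__":
    for func in (using_star_map, using_genexp):
        seconds = timeit(lambda: func(nums), number=100)
        print(f"{func.__name__}: {seconds / 100 * 1e6:.2f} usec/pass")
```

Which variant comes out ahead may differ by interpreter version and machine, so it is worth measuring locally rather than relying on any single set of numbers.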
One rule of thumb: use a list comprehension for a custom function and list(map()) for a built-in function. (A blunter version of the advice is "Always use map()!") I ran a quick test comparing three methods for invoking the method of an object. The time difference, in this case, is negligible and depends on the function in question (see @Alex Martelli's response).
The only thing we can tell seems to be that, oddly, while we expect list comprehensions [...] to perform better than generator expressions (...), map is ALSO more efficient than generator expressions (again assuming that all values are evaluated/used).
starmap() takes an iterable whose elements are themselves iterables and unpacks each element into separate arguments for the function. It is otherwise similar to map(). In the itertools documentation it is grouped with the iterators that terminate on the shortest input sequence.
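A minimal sketch of that unpacking behaviour, assuming Python 3. The pairs list is made-up sample data; pow is the built-in.

```python
from itertools import starmap

pairs = [(2, 5), (3, 2), (10, 3)]

# starmap unpacks each tuple into positional arguments: pow(2, 5), pow(3, 2), ...
via_starmap = list(starmap(pow, pairs))

# The equivalent spelling with explicit unpacking in a list comprehension.
via_comprehension = [pow(*args) for args in pairs]

assert via_starmap == via_comprehension == [32, 9, 1000]
```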
For this specific test case, [*map(DummyNum.add, vals)] would be faster (because DummyNum.add(x) and x.add() have basically the same performance). By the way, explicit list() calls are slightly slower than list comprehensions.
The difference I normally see is that map()/starmap() are most appropriate where you are literally just calling a function on every item in a list. In this case, they are a little clearer:
(f(x) for x in y)
map(f, y) # itertools.imap(f, y) in 2.x
(f(*x) for x in y)
starmap(f, y)
As soon as you start needing to throw in lambda or filter as well, you should switch to the list comprehension/generator expression, but in cases where it's a single function, the syntax feels very verbose for a generator expression or list comprehension.
They are interchangeable, and when in doubt, stick to the generator expression as it's more readable in general, but in a simple case (map(int, strings), starmap(Vector, points)) using map()/starmap() can sometimes make things easier to read.
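To make the tipping point concrete, here is a small sketch assuming Python 3. The rows data and the "+ 1" expression are invented for illustration. With a single existing function starmap() stays compact, but once an extra expression or a filter is involved, the comprehension states it directly.

```python
from itertools import starmap
from operator import mul

rows = [(2, 3), (4, 5), (6, 7)]

# Single existing function: starmap reads cleanly.
products = list(starmap(mul, rows))

# Anything beyond a single call needs a lambda with starmap...
scaled_lambda = list(starmap(lambda a, b: a * b + 1, rows))

# ...whereas the comprehension states the expression (and a filter) directly.
scaled_comp = [a * b + 1 for a, b in rows]
filtered = [a * b for a, b in rows if a > 2]

assert scaled_lambda == scaled_comp
```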
An example where I think starmap() is more readable:
from collections import namedtuple
from itertools import starmap
points = [(10, 20), (20, 10), (0, 0), (20, 20)]
Vector = namedtuple("Vector", ["x", "y"])
for vector in (Vector(*point) for point in points):
    ...
for vector in starmap(Vector, points):
    ...
And for map():
values = ["10", "20", "0"]
for number in (int(x) for x in values):
    ...
for number in map(int, values):
    ...
python -m timeit -s "from itertools import starmap" -s "from operator import sub" -s "numbers = zip(range(100000), range(100000))" "sum(starmap(sub, numbers))"
1000000 loops, best of 3: 0.258 usec per loop
python -m timeit -s "numbers = zip(range(100000), range(100000))" "sum(x-y for x, y in numbers)"
1000000 loops, best of 3: 0.446 usec per loop
For constructing a namedtuple:
python -m timeit -s "from itertools import starmap" -s "from collections import namedtuple" -s "numbers = zip(range(100000), reversed(range(100000)))" -s "Vector = namedtuple('Vector', ['x', 'y'])" "list(starmap(Vector, numbers))"
1000000 loops, best of 3: 0.98 usec per loop
python -m timeit -s "from collections import namedtuple" -s "numbers = zip(range(100000), reversed(range(100000)))" -s "Vector = namedtuple('Vector', ['x', 'y'])" "[Vector(*pos) for pos in numbers]"
1000000 loops, best of 3: 0.375 usec per loop
In my tests, where we are talking about using simple functions (no lambda), starmap() is faster than the equivalent generator expression. Naturally, performance should take a back seat to readability unless it's a proven bottleneck.
Example of how lambda kills any performance gain: the same example as in the first set, but with lambda instead of operator.sub():
python -m timeit -s "from itertools import starmap" -s "numbers = zip(range(100000), range(100000))" "sum(starmap(lambda x, y: x-y, numbers))"
1000000 loops, best of 3: 0.546 usec per loop
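One caveat when rerunning these commands on Python 3: zip() returns a one-shot iterator, so a numbers object built once in the -s setup is exhausted after the first timed loop and later loops iterate over nothing. A minimal sketch, assuming Python 3, that materializes the pairs with list() so every loop sees the full data. The input size, the labels and the repeat count are arbitrary choices made here.

```python
from itertools import starmap
from operator import sub
from timeit import timeit

# list() makes the pairs reusable across timing loops; a bare zip() would not be.
numbers = list(zip(range(1000), range(1000)))

candidates = {
    "starmap + operator.sub": lambda: sum(starmap(sub, numbers)),
    "generator expression": lambda: sum(x - y for x, y in numbers),
    "starmap + lambda": lambda: sum(starmap(lambda x, y: x - y, numbers)),
}

# All three spellings must agree before their timings are worth comparing.
assert all(run() == 0 for run in candidates.values())

if __name__ == "__main__":
    for label, run in candidates.items():
        print(f"{label}: {timeit(run, number=200):.4f} s for 200 loops")
```

No timings are quoted here on purpose; the point is only that each loop processes all of the pairs, which makes the three variants comparable.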