
When could `starmap` be preferred over a list comprehension?

While answering the question "Clunky calculation of differences between an incrementing set of numbers, is there a more beautiful way?", I came up with two solutions, one using a list comprehension and the other using itertools.starmap.

To me, the list comprehension syntax looks more lucid, more readable, less verbose and more Pythonic. Still, since starmap is readily available in itertools, I figured there has to be a reason for it.

My question is: when could starmap be preferred over a list comprehension?

Note: if it's a matter of style, then it definitely contradicts "There should be one-- and preferably only one --obvious way to do it."

Head to Head Comparison

Readability counts. --- LC

It's again a matter of perception, but to me LC is more readable than starmap. To use starmap, you need to either import operator, define a lambda, or write an explicit multi-argument function, and on top of that you need an extra import from itertools.
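To make the comparison concrete, here is a minimal sketch of the three spellings. It assumes Python 3 (so the built-in zip stands in for izip), and the sample list nums is made up for illustration:

```python
from itertools import starmap
from operator import sub

nums = [1, 4, 9, 16]  # hypothetical sample data
pairs = list(zip(nums[1:], nums))  # (next, previous) pairs

# starmap with a ready-made function from the operator module
deltas_operator = list(starmap(sub, pairs))

# starmap with a lambda
deltas_lambda = list(starmap(lambda x, y: x - y, pairs))

# list comprehension, no imports needed
deltas_lc = [x - y for x, y in pairs]

print(deltas_operator, deltas_lambda, deltas_lc)
```

All three produce the same differences; only the amount of supporting code differs.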

Performance --- LC

>>> import random
>>> from itertools import starmap, izip
>>> from operator import sub
>>> from timeit import Timer
>>> def using_star_map(nums):
...     delta = starmap(sub, izip(nums[1:], nums))
...     return sum(delta) / float(len(nums) - 1)
...
>>> def using_LC(nums):
...     delta = (x - y for x, y in izip(nums[1:], nums))
...     return sum(delta) / float(len(nums) - 1)
...
>>> nums = [random.randint(1, 10) for _ in range(100000)]
>>> t1 = Timer(stmt='using_star_map(nums)', setup='from __main__ import nums,using_star_map;from itertools import starmap,izip')
>>> t2 = Timer(stmt='using_LC(nums)', setup='from __main__ import nums,using_LC;from itertools import izip')
>>> print "%.2f usec/pass" % (1000000 * t1.timeit(number=1000)/100000)
235.03 usec/pass
>>> print "%.2f usec/pass" % (1000000 * t2.timeit(number=1000)/100000)
181.87 usec/pass
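The session above is Python 2 (izip, print statement). For anyone on Python 3, a rough port might look like the sketch below. This is an assumed equivalent, not what was originally run: the function name using_gen_expr and the pass count are my own choices, and the second variant is named for what it is, since (x - y for ...) is a generator expression rather than a list comprehension. The time is divided by the number of passes, so the printed figure is per call.

```python
import random
from itertools import starmap
from operator import sub
from timeit import timeit


def using_star_map(nums):
    # zip() is lazy in Python 3, like izip() was in Python 2
    delta = starmap(sub, zip(nums[1:], nums))
    return sum(delta) / (len(nums) - 1)


def using_gen_expr(nums):
    delta = (x - y for x, y in zip(nums[1:], nums))
    return sum(delta) / (len(nums) - 1)


nums = [random.randint(1, 10) for _ in range(100000)]
passes = 10  # arbitrary; raise it for steadier numbers

for func in (using_star_map, using_gen_expr):
    seconds = timeit(lambda: func(nums), number=passes)
    print("%s: %.2f usec/pass" % (func.__name__, 1e6 * seconds / passes))
```

Results will vary by interpreter version and machine, so treat any single run as indicative only.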
Abhijit asked May 04 '12 12:05


1 Answer

The difference I normally see is that map()/starmap() are most appropriate where you are literally just calling a function on every item in a list. In this case, they are a little clearer:

(f(x) for x in y)
map(f, y) # itertools.imap(f, y) in 2.x

(f(*x) for x in y)
starmap(f, y)

As soon as you need to throw in a lambda or filter as well, you should switch to the list comprehension/generator expression, but where it's a single function, the generator expression or list comprehension syntax feels very verbose.
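As a small illustration of that tipping point, here is a sketch that filters and transforms in one pass. The input list and variable names are invented for the example:

```python
values = ["10", "x", "20", "-", "0"]  # hypothetical input with some junk in it

# map() + filter() + lambda: has to be read from the inside out
doubled_map = list(map(lambda n: n * 2, map(int, filter(str.isdigit, values))))

# list comprehension: filter and transform read left to right
doubled_lc = [int(v) * 2 for v in values if v.isdigit()]

print(doubled_map)
print(doubled_lc)
```

Both lines compute the same list, but once a lambda and a filter are involved the comprehension is usually the easier one to scan.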

They are interchangeable, and when in doubt, stick to the generator expression as it's more readable in general, but in a simple case (map(int, strings), starmap(Vector, points)) using map()/starmap() can sometimes make things easier to read.

Example:

An example where I think starmap() is more readable:

from collections import namedtuple
from itertools import starmap

points = [(10, 20), (20, 10), (0, 0), (20, 20)]

Vector = namedtuple("Vector", ["x", "y"])

for vector in (Vector(*point) for point in points):
    ...

for vector in starmap(Vector, points):
    ...

And for map():

values = ["10", "20", "0"]

for number in (int(x) for x in values):
    ...

for number in map(int, values):
    ...
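For anyone who wants to run the two examples above, here is a self-contained sketch (Python 3 assumed) that collects the results into lists so the equivalence is visible. The names ending in _genexp, _starmap and _map are just labels for this illustration:

```python
from collections import namedtuple
from itertools import starmap

Vector = namedtuple("Vector", ["x", "y"])
points = [(10, 20), (20, 10), (0, 0), (20, 20)]
values = ["10", "20", "0"]

# Unpacking each tuple into the constructor, both ways
vectors_genexp = list(Vector(*point) for point in points)
vectors_starmap = list(starmap(Vector, points))

# Calling a one-argument function on each item, both ways
numbers_genexp = list(int(x) for x in values)
numbers_map = list(map(int, values))

print(vectors_starmap[0])  # Vector(x=10, y=20)
print(numbers_map)         # [10, 20, 0]
```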

Performance:

python -m timeit -s "from itertools import starmap" -s "from operator import sub" -s "numbers = zip(range(100000), range(100000))" "sum(starmap(sub, numbers))"                         
1000000 loops, best of 3: 0.258 usec per loop

python -m timeit -s "numbers = zip(range(100000), range(100000))" "sum(x-y for x, y in numbers)"                          
1000000 loops, best of 3: 0.446 usec per loop

For constructing a namedtuple:

python -m timeit -s "from itertools import starmap" -s "from collections import namedtuple" -s "numbers = zip(range(100000), reversed(range(100000)))" -s "Vector = namedtuple('Vector', ['x', 'y'])" "list(starmap(Vector, numbers))"
1000000 loops, best of 3: 0.98 usec per loop

python -m timeit -s "from collections import namedtuple" -s "numbers = zip(range(100000), reversed(range(100000)))" -s "Vector = namedtuple('Vector', ['x', 'y'])" "[Vector(*pos) for pos in numbers]"
1000000 loops, best of 3: 0.375 usec per loop

In my tests with simple functions (no lambda), starmap() was faster than the equivalent generator expression for the subtraction, though the list comprehension was faster when constructing namedtuples. Naturally, performance should take a back seat to readability unless it's a proven bottleneck.

Example of how lambda kills any performance gain, same example as in the first set, but with lambda instead of operator.sub():

python -m timeit -s "from itertools import starmap" -s "numbers = zip(range(100000), range(100000))" "sum(starmap(lambda x, y: x-y, numbers))" 
1000000 loops, best of 3: 0.546 usec per loop
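One caveat when reproducing these commands on Python 3: zip() returns a one-shot iterator, so a numbers object created in the -s setup is empty after the first loop, and the remaining loops mostly measure call overhead rather than work over 100000 pairs. A sketch that avoids this by materialising the pairs first is below. It uses the timeit module directly; the labels and the pass count are arbitrary choices for the illustration, and no particular timings are claimed.

```python
from itertools import starmap
from operator import sub
from timeit import timeit

# list() so the pairs can be iterated again on every pass;
# a bare zip() in Python 3 would be exhausted after the first pass.
numbers = list(zip(range(100000), range(100000)))

candidates = {
    "starmap(sub)": lambda: sum(starmap(sub, numbers)),
    "generator expression": lambda: sum(x - y for x, y in numbers),
    "starmap(lambda)": lambda: sum(starmap(lambda x, y: x - y, numbers)),
}

passes = 20  # arbitrary; raise it for steadier numbers
results = {}
for label, func in candidates.items():
    results[label] = func()  # sanity check: every variant computes the same sum
    seconds = timeit(func, number=passes)
    print("%-22s %.1f usec/pass" % (label, 1e6 * seconds / passes))
```

On the command line, the same effect can be had by writing numbers = list(zip(...)) in the -s setup string.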
Gareth Latty answered Sep 18 '22 11:09