When could `starmap` be preferred over `List Comprehension`?

While answering the question "Clunky calculation of differences between an incrementing set of numbers, is there a more beautiful way?", I came up with two solutions, one using a list comprehension and the other using itertools.starmap.

To me, list comprehension syntax looks more lucid, more readable, less verbose and more Pythonic. Still, since starmap is readily available in itertools, I assume there has to be a reason for it.

My question is: when could starmap be preferred over a list comprehension?

Note: If it's only a matter of style, then it definitely contradicts "There should be one-- and preferably only one --obvious way to do it."

Head to Head Comparison

Readability counts. --- LC

It's again a matter of perception, but to me LC is more readable than starmap. To use starmap, you need to import operator, define a lambda, or write an explicit multi-argument function, and on top of that import starmap from itertools.
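To make that trade-off concrete, here is a minimal Python 3 sketch of the three ways to hand starmap a two-argument function, next to the comprehension. The sample list and the helper name diff are made up for illustration and do not come from the linked question.

```python
from itertools import starmap
from operator import sub

nums = [1, 4, 9, 16, 25]            # illustrative data only
pairs = list(zip(nums[1:], nums))   # (next, previous) pairs

def diff(a, b):
    """Hypothetical explicit two-argument function."""
    return a - b

with_operator = list(starmap(sub, pairs))               # needs `import operator`
with_lambda = list(starmap(lambda a, b: a - b, pairs))  # needs an inline lambda
with_function = list(starmap(diff, pairs))              # needs a named helper
with_lc = [a - b for a, b in pairs]                     # no extra imports at all
```

All four produce the same list of differences; only the amount of supporting code differs.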

Performance --- LC

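>>> import random
>>> from timeit import Timer
>>> from itertools import starmap, izip  # Python 2: izip is the lazy zip
>>> from operator import sub
>>> def using_star_map(nums):
    delta=starmap(sub,izip(nums[1:],nums))
    return sum(delta)/float(len(nums)-1)
>>> def using_LC(nums):
    # strictly speaking a generator expression, not a list comprehension
    delta=(x-y for x,y in izip(nums[1:],nums))
    return sum(delta)/float(len(nums)-1)
>>> nums=[random.randint(1,10) for _ in range(100000)]
>>> t1=Timer(stmt='using_star_map(nums)',setup='from __main__ import nums,using_star_map;from itertools import starmap,izip')
>>> t2=Timer(stmt='using_LC(nums)',setup='from __main__ import nums,using_LC;from itertools import izip')
>>> # Note: the total for 1000 passes is divided by 100000, not 1000,
>>> # so read the two figures relative to each other, not as true usec/pass.
>>> print "%.2f usec/pass" % (1000000 * t1.timeit(number=1000)/100000)
235.03 usec/pass
>>> print "%.2f usec/pass" % (1000000 * t2.timeit(number=1000)/100000)
181.87 usec/pass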
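For readers on Python 3, where izip is gone and the built-in zip is already lazy, the comparison might be ported roughly as follows. This is a sketch, not the code that produced the figures above: the name using_genexp, the helper usec_per_pass and the small pass count are my own choices, and the total is divided by the number of passes to get a per-pass figure.

```python
import random
from itertools import starmap
from operator import sub
from timeit import Timer

def using_star_map(nums):
    delta = starmap(sub, zip(nums[1:], nums))
    return sum(delta) / (len(nums) - 1)

def using_genexp(nums):
    delta = (x - y for x, y in zip(nums[1:], nums))
    return sum(delta) / (len(nums) - 1)

nums = [random.randint(1, 10) for _ in range(100000)]
passes = 10  # deliberately small so the sketch finishes quickly

def usec_per_pass(func):
    # timeit returns the total seconds for `passes` calls;
    # scale to microseconds per single call.
    total = Timer(lambda: func(nums)).timeit(number=passes)
    return 1_000_000 * total / passes

print("starmap: %.2f usec/pass" % usec_per_pass(using_star_map))
print("genexp:  %.2f usec/pass" % usec_per_pass(using_genexp))
```

The absolute numbers will vary by machine and interpreter version, so no output is shown here.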

673

asked May 04 '12 12:05

Abhijit

1 Answer

The difference I normally see is that map()/starmap() are most appropriate where you are literally just calling a function on every item in a list. In this case, they are a little clearer:

(f(x) for x in y)
map(f, y) # itertools.imap(f, y) in 2.x

(f(*x) for x in y)
starmap(f, y)

As soon as you start needing to throw in lambda or filter as well, you should switch to the list comprehension/generator expression, but in cases where it's a single function, the syntax feels very verbose for a generator expression or list comprehension.
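A small illustration of that tipping point, using made-up input: once the call needs both a filter and an inline expression, the map()/filter()/lambda version nests three calls, while the comprehension stays flat.

```python
values = ["10", "x", "20", "", "0"]   # illustrative input only

# map + filter + lambda: read from the inside out
with_map = list(map(lambda s: int(s) * 2, filter(str.isdigit, values)))

# list comprehension: read left to right
with_lc = [int(s) * 2 for s in values if s.isdigit()]
```

Both build the same list; the second form keeps the condition and the transformation in one place.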

They are interchangeable, and when in doubt, stick to the generator expression as it's more readable in general, but in a simple case (map(int, strings), starmap(Vector, points)) using map()/starmap() can sometimes make things easier to read.

Example:

An example where I think starmap() is more readable:

from collections import namedtuple
from itertools import starmap

points = [(10, 20), (20, 10), (0, 0), (20, 20)]

Vector = namedtuple("Vector", ["x", "y"])

for vector in (Vector(*point) for point in points):
    ...

for vector in starmap(Vector, points):
    ...

And for map():

values = ["10", "20", "0"]

for number in (int(x) for x in values):
    ...

for number in map(int, values):
    ...

Performance:

python -m timeit -s "from itertools import starmap" -s "from operator import sub" -s "numbers = zip(range(100000), range(100000))" "sum(starmap(sub, numbers))"
1000000 loops, best of 3: 0.258 usec per loop

python -m timeit -s "numbers = zip(range(100000), range(100000))" "sum(x-y for x, y in numbers)"
1000000 loops, best of 3: 0.446 usec per loop

For constructing a namedtuple:

python -m timeit -s "from itertools import starmap" -s "from collections import namedtuple" -s "numbers = zip(range(100000), reversed(range(100000)))" -s "Vector = namedtuple('Vector', ['x', 'y'])" "list(starmap(Vector, numbers))"
1000000 loops, best of 3: 0.98 usec per loop

python -m timeit -s "from collections import namedtuple" -s "numbers = zip(range(100000), reversed(range(100000)))" -s "Vector = namedtuple('Vector', ['x', 'y'])" "[Vector(*pos) for pos in numbers]"
1000000 loops, best of 3: 0.375 usec per loop

In my tests with a simple function (no lambda), starmap() was faster than the equivalent generator expression in the subtraction case, although the list comprehension came out ahead when constructing namedtuples. Naturally, performance should take a back seat to readability unless it's a proven bottleneck.

Example of how lambda kills any performance gain: the same example as in the first set, but with a lambda instead of operator.sub:

python -m timeit -s "from itertools import starmap" -s "numbers = zip(range(100000), range(100000))" "sum(starmap(lambda x, y: x-y, numbers))"
1000000 loops, best of 3: 0.546 usec per loop
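One thing to watch when reproducing these measurements on Python 3: zip returns a one-shot iterator, so a setup such as numbers = zip(...) is consumed by the first loop and every later loop times an empty iterable. Below is a hedged sketch that materializes the pairs first and uses the timeit module directly. The loop count, the labels and the dictionary layout are arbitrary choices of mine, and no timings are claimed.

```python
from itertools import starmap
from operator import sub
from timeit import timeit

# Build the pairs once as a list so every loop sees all 100000 items.
numbers = list(zip(range(100000), range(100000)))

candidates = {
    "starmap + operator.sub": lambda: sum(starmap(sub, numbers)),
    "generator expression": lambda: sum(x - y for x, y in numbers),
    "starmap + lambda": lambda: sum(starmap(lambda x, y: x - y, numbers)),
}

loops = 20  # arbitrary; raise it for steadier numbers
results = {}
for label, func in candidates.items():
    seconds = timeit(func, number=loops)
    results[label] = seconds / loops
    print("%-24s %.3f ms per loop" % (label, results[label] * 1000))
```

Comparing the three lines of output on your own interpreter is the most reliable way to check whether the starmap advantage holds for your workload.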

137

answered Sep 18 '22 11:09

Gareth Latty

Tags:

python

list-comprehension

itertools