Suppose you have a list comprehension in Python, like
values = [f(x) for x in range(0, 1000)]
with f being just a function without side effects. So all the entries can be computed independently.
Is Python able to increase the performance of this list comprehension compared with the "obvious" implementation; e.g. by shared-memory-parallelization on multicore CPUs?
In Python 3.2 they added concurrent.futures, a nice library for solving problems concurrently. Consider this example:
import math, time
from concurrent import futures

PRIMES = [112272535095293, 112582705942171, 112272535095293, 115280095190773, 115797848077099, 1099726899285419, 112272535095293, 112582705942171, 112272535095293, 115280095190773, 115797848077099, 1099726899285419]

def is_prime(n):
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2  # 2 is the only even prime
    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def bench(f):
    start = time.time()
    f()
    elapsed = time.time() - start
    print("Completed in {} seconds".format(elapsed))

def concurrent():
    with futures.ProcessPoolExecutor() as executor:
        values = list(executor.map(is_prime, PRIMES))

def listcomp():
    values = [is_prime(x) for x in PRIMES]
Results on my quad core:
>>> bench(listcomp)
Completed in 14.463825941085815 seconds
>>> bench(concurrent)
Completed in 3.818351984024048 seconds
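One knob worth knowing: when the function is cheap relative to the cost of shipping arguments between processes, executor.map accepts a chunksize argument (available since Python 3.5) that batches several items into each inter-process message. A minimal sketch, using a hypothetical cheap square function rather than is_prime:

```python
from concurrent import futures

def square(x):
    # stand-in for any cheap, side-effect-free function
    return x * x

if __name__ == "__main__":
    with futures.ProcessPoolExecutor() as executor:
        # chunksize batches work items, cutting per-item IPC overhead;
        # results still come back in input order
        values = list(executor.map(square, range(1000), chunksize=100))
    print(values == [x * x for x in range(1000)])  # True
```

The right chunksize depends on how expensive each call is; for a heavyweight function like is_prime the default of 1 is fine.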
No, Python will not magically parallelize this for you. In fact, it can't: it cannot prove the independence of the entries, and that would require a great deal of program inspection/verification, which is impossible to get right in the general case. (In CPython, even thread-based shared-memory parallelism would not speed up CPU-bound work, because the global interpreter lock allows only one thread to execute Python bytecode at a time.)
If you want quick coarse-grained multicore parallelism, I recommend joblib instead:
from joblib import delayed, Parallel
values = Parallel(n_jobs=NUM_CPUS)(delayed(f)(x) for x in range(1000))  # n_jobs=-1 means "use all cores"
Not only have I witnessed near-linear speedups using this library, it also has the great feature of forwarding signals such as Ctrl-C to its worker processes, which cannot be said of all multiprocessing libraries.
Note that joblib doesn't really support shared-memory parallelism: it spawns worker processes, not threads, so it incurs some communication overhead from sending data to workers and results back to the master process.