from multiprocessing import Pool
def op1(data):
    return [data[elem] + 1 for elem in range(len(data))]
data = [[elem for elem in range(20)] for elem in range(500000)]
import time
start_time = time.time()
re = []
for data_ in data:
    re.append(op1(data_))
print('--- %s seconds ---' % (time.time() - start_time))
start_time = time.time()
pool = Pool(processes=4)
data = pool.map(op1, data)
print('--- %s seconds ---' % (time.time() - start_time))
I get a much slower run time with pool than I get with the plain for loop. But isn't pool supposed to be using 4 processes to do the computation in parallel?
So, multiprocessing is faster when the program is CPU-bound. When your program spends most of its time waiting for I/O to complete, threading may be more efficient, because the threads can overlap those waits. For CPU-bound work, however, multiprocessing is generally the better choice: each process has its own interpreter and its own GIL, so the work can run in parallel on separate cores.
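To make that distinction concrete, here is a minimal sketch (the helper names cpu_bound and io_bound and the sizes are made up for illustration): a CPU-bound loop only speeds up with processes, because each process has its own GIL, while simulated I/O waits overlap fine with threads.

import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool

def cpu_bound(n):
    # Pure computation: it holds the GIL, so threads cannot run it in parallel.
    return sum(i * i for i in range(n))

def io_bound(delay):
    # Simulated I/O wait: time.sleep releases the GIL, so threads overlap nicely.
    time.sleep(delay)
    return delay

if __name__ == '__main__':
    work = [2_000_000] * 8
    with Pool(processes=4) as procs:
        start = time.perf_counter()
        procs.map(cpu_bound, work)          # runs in parallel on separate cores
        print('processes, CPU-bound:', time.perf_counter() - start)
    with ThreadPool(processes=4) as threads:
        start = time.perf_counter()
        threads.map(io_bound, [0.5] * 8)    # threads overlap the waiting
        print('threads, I/O-bound  :', time.perf_counter() - start)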
The Python multiprocessing Pool class lets you create and manage a pool of worker processes. Although the Pool has been available in Python for a long time, it is not widely used, perhaps because of misunderstandings about the capabilities and limitations of processes and threads in Python.
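For reference, the basic pattern looks roughly like this (a minimal sketch; square is just a placeholder worker function):

from multiprocessing import Pool

def square(x):
    # The worker function must be picklable (defined at module top level).
    return x * x

if __name__ == '__main__':
    # Entering the with-block starts the worker processes; leaving it closes and joins them.
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)  # [0, 1, 4, 9, ...]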
multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads.
Short answer: Yes, the operations will usually be done on (a subset of) the available cores. But the communication overhead is large. In your example the workload is too small compared to the overhead.
When you construct a pool, a number of worker processes are started. If you then instruct it to map a function over a given input, the following happens:
- the input list is split into chunks;
- each chunk is serialized and sent to a worker process;
- each worker computes the results for its chunk;
- the results are serialized and sent back to the main process; and
- the main process joins the partial results into a single list.
Now splitting, communicating, and joining the data are all steps carried out by the main process, and they cannot be parallelized. Since the actual operation is fast (O(n) in the input size n), the overhead has the same time complexity as the computation itself.
So, complexity-wise, even if you had millions of cores it would not make much difference, because communicating the list is probably already more expensive than computing the results.
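You can make that point visible by timing the serialization on its own. A rough sketch (multiprocessing pickles the data it sends between processes, so pickle.dumps is a reasonable stand-in for the communication cost):

import pickle
import time

def op1(data):
    return [x + 1 for x in data]

data = [list(range(20)) for _ in range(500000)]

start = time.perf_counter()
_ = [op1(row) for row in data]
print('compute only  :', time.perf_counter() - start)

start = time.perf_counter()
_ = pickle.dumps(data)   # roughly what the main process pays just to ship the input
print('serialize only:', time.perf_counter() - start)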
That's why you should parallelize computationally expensive tasks, not trivial ones: the amount of processing should be large compared to the amount of communication. In your example, the work is trivial: you add 1 to every element. Serializing, however, is less trivial: you have to encode every list you send to the workers.
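By contrast, if each task does plenty of computation while only a small amount of data crosses the process boundary, the pool should come out ahead. A hedged sketch (the busy function and the task sizes are arbitrary):

import time
from multiprocessing import Pool

def busy(n):
    # Plenty of computation per task, but only one small int travels each way.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == '__main__':
    tasks = [3_000_000] * 16

    start = time.perf_counter()
    serial = [busy(n) for n in tasks]
    print('serial  :', time.perf_counter() - start)

    start = time.perf_counter()
    with Pool(processes=4) as pool:
        parallel = pool.map(busy, tasks)
    print('parallel:', time.perf_counter() - start)

    assert serial == parallel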
There are a couple of potential trouble spots with your code, but primarily it's too simple.
The multiprocessing module works by creating separate processes and communicating among them. For each process created, you have to pay the operating system's process startup cost as well as the Python interpreter's startup cost. Those costs can be high or low, but they're non-zero in any case.
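You can get a feel for that fixed cost by timing a pool that does essentially no work (a rough sketch; the number depends heavily on the OS and the start method):

import time
from multiprocessing import Pool

if __name__ == '__main__':
    start = time.perf_counter()
    with Pool(processes=4) as pool:
        pool.map(abs, [1, 2, 3, 4])   # trivial work: the time is almost all startup and teardown
    print('pool startup + teardown:', time.perf_counter() - start)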
Once you pay those startup costs, you then pool.map the worker function across all the processes. Which basically adds 1 to a few numbers. This is not a significant load, as your tests prove.
What's worse, you're using .map(), which is implicitly ordered (compare with .imap_unordered()), so there's synchronization going on, leaving even less freedom for the various CPU cores to give you speed.
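If you don't care about result order, imap_unordered hands results back as soon as each worker finishes; a small sketch of the difference (the work function and chunksize are arbitrary):

from multiprocessing import Pool

def work(x):
    return x * x

if __name__ == '__main__':
    items = range(1000)
    with Pool(processes=4) as pool:
        # .map(): results come back in input order, so a slow early item delays everything after it.
        ordered = pool.map(work, items, chunksize=50)
        # .imap_unordered(): results are yielded as workers finish, in whatever order that happens.
        unordered = list(pool.imap_unordered(work, items, chunksize=50))
    print(sorted(unordered) == ordered)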
If there's a problem here, it's a "design of experiment" problem: you haven't created a sufficiently difficult problem for multiprocessing to be able to help you.