Python multiprocessing - Why is using functools.partial slower than default arguments?

Tags: python, python-3.x, functools, python-multiprocessing

Consider the following function:

def f(x, dummy=list(range(10000000))):
    return x

If I use multiprocessing.Pool.imap, I get the following timings:

import time
import os
from multiprocessing import Pool

def f(x, dummy=list(range(10000000))):
    return x

start = time.time()
pool = Pool(2)
for x in pool.imap(f, range(10)):
    print("parent process, x=%s, elapsed=%s" % (x, int(time.time() - start)))

parent process, x=0, elapsed=0
parent process, x=1, elapsed=0
parent process, x=2, elapsed=0
parent process, x=3, elapsed=0
parent process, x=4, elapsed=0
parent process, x=5, elapsed=0
parent process, x=6, elapsed=0
parent process, x=7, elapsed=0
parent process, x=8, elapsed=0
parent process, x=9, elapsed=0

Now if I use functools.partial instead of using a default value:

import time
import os
from multiprocessing import Pool
from functools import partial

def f(x, dummy):
    return x

start = time.time()
g = partial(f, dummy=list(range(10000000)))
pool = Pool(2)
for x in pool.imap(g, range(10)):
    print("parent process, x=%s, elapsed=%s" % (x, int(time.time() - start)))

parent process, x=0, elapsed=1
parent process, x=1, elapsed=2
parent process, x=2, elapsed=5
parent process, x=3, elapsed=7
parent process, x=4, elapsed=8
parent process, x=5, elapsed=9
parent process, x=6, elapsed=10
parent process, x=7, elapsed=10
parent process, x=8, elapsed=11
parent process, x=9, elapsed=11

Why is the version using functools.partial so much slower?

588

asked Jan 28 '16 12:01

usual me

1 Answer

Using multiprocessing requires sending the worker processes the function to run, not just the arguments to pass. That information is pickled in the main process, sent to the worker process, and unpickled there.

This leads to the primary issue:

Pickling a function with default arguments is cheap; it only pickles the name of the function (plus the info to let Python know it's a function); the worker processes just look up the local copy of the name. They already have a named function f to find, so it costs almost nothing to pass it.

But pickling a partial involves pickling the underlying function (cheap) and all the bound arguments (expensive when the bound argument is a 10-million-element list). So every time a task is dispatched in the partial case, the parent pickles the bound argument and sends it to the worker process, which unpickles it and only then does the "real" work. On my machine, that pickle is roughly 50 MB in size, which is a huge amount of overhead; in quick timing tests on my machine, pickling and unpickling a 10-million-element list of 0s takes about 620 ms (and that's ignoring the overhead of actually transferring the 50 MB of data).
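To make the size difference concrete, here is a minimal sketch that compares the two pickles. It assumes a much smaller list (100,000 items rather than 10 million) so it runs quickly, and it uses the standard library's bisect.bisect_left as a stand-in for an importable module-level function such as f. The names big_list, by_reference, and by_value are illustrative only.

```python
import bisect
import pickle
from functools import partial

# Stand-in for the question's 10M-element list, shrunk so the sketch runs fast.
big_list = list(range(100_000))

# An importable function pickles as a reference (module name + function name).
by_reference = pickle.dumps(bisect.bisect_left)

# A partial pickles the wrapped function *and* every bound argument.
g = partial(bisect.bisect_left, big_list)
by_value = pickle.dumps(g)

print("function pickle:", len(by_reference), "bytes")
print("partial pickle: ", len(by_value), "bytes")
```

The first number should be a few dozen bytes regardless of how much data the function's defaults hold; the second grows with the length of the bound list.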

partials have to pickle this way because they don't know their own names. When pickling a function like f, f (being def-ed) knows its qualified name (in an interactive interpreter or from the main module of a program, it's __main__.f), so the remote side can just recreate it locally by doing the equivalent of from __main__ import f. But the partial doesn't know its name; sure, you assigned it to g, but neither pickle nor the partial itself knows it is available under the qualified name __main__.g; it could be named foo.fred or a million other things. So it has to pickle the info necessary to recreate it entirely from scratch. It's also pickling for each call (not just once per worker) because it doesn't know that the callable isn't changing in the parent between work items, and it's always trying to ensure it sends up-to-date state.
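A small sketch of that difference, again using bisect.bisect_left as an assumed stand-in for a named module-level function (the variable names are hypothetical): the function has a name that pickle can refer to, the partial does not, and a pickle round trip rebuilds the partial from its parts with a fresh copy of the bound list.

```python
import bisect
import pickle
from functools import partial

big_list = list(range(1_000))
g = partial(bisect.bisect_left, big_list)

# The wrapped function carries a name that pickle can look up on the other side...
print(bisect.bisect_left.__name__)
# ...but the partial object itself has no __name__ to refer to.
print(hasattr(g, "__name__"))

# So a round trip rebuilds the partial from scratch, copying the bound list.
restored = pickle.loads(pickle.dumps(g))
print(restored.func is bisect.bisect_left)  # function was looked up by name
print(restored.args[0] == big_list)         # same contents
print(restored.args[0] is big_list)         # but a different list object
```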

You have other issues (you time the creation of the list only in the partial case, and there is minor overhead in calling a partial-wrapped function vs. calling the function directly), but those are chump change relative to the per-call overhead that pickling and unpickling the partial adds. The initial creation of the list adds a one-time overhead of a little under half what each pickle/unpickle cycle costs, and the overhead of calling through the partial is less than a microsecond.
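If you want to check those relative costs on your own machine, a rough sketch like the following measures the one-time list creation against a single pickle/unpickle cycle. It assumes a 1-million-element list (ten times smaller than the one in the question) to keep the run short; N, build_s, and round_trip_s are illustrative names, and the absolute numbers will vary by machine and Python version.

```python
import pickle
import time

N = 1_000_000  # assumption: 10x smaller than the question's list

# One-time cost: building the list in the parent.
t0 = time.perf_counter()
data = list(range(N))
build_s = time.perf_counter() - t0

# Per-task cost with a partial: one pickle in the parent, one unpickle in the worker.
t0 = time.perf_counter()
copy = pickle.loads(pickle.dumps(data))
round_trip_s = time.perf_counter() - t0

print(f"build list:        {build_s * 1000:.1f} ms (paid once)")
print(f"pickle + unpickle: {round_trip_s * 1000:.1f} ms (paid per task)")
```

This leaves out the cost of moving the bytes between processes, so the real per-task overhead in a Pool would be higher than the second figure.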

187

answered Sep 18 '22 15:09

ShadowRanger
