Replace pickle in Python multiprocessing lib

Tags:

python

lambda

pickle

python-multiprocessing

I need to execute the code below (simplified version of my real code base in Python 3.5):

import multiprocessing
def forever(do_something=None):
    while True:
        do_something()

p = multiprocessing.Process(target=forever, args=(lambda: print("do something"),))
p.start()

In order to create the new process, Python needs to pickle the function and the lambda passed as the target. Unfortunately, pickle cannot serialize lambdas, and the output looks like this:

_pickle.PicklingError: Can't pickle <function <lambda> at 0x00C0D4B0>: attribute lookup <lambda> on __main__ failed
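
The failure can be reproduced without multiprocessing at all. The following is a minimal sketch using only the standard library; `try_pickle` is a hypothetical helper name introduced for illustration, not part of any library.

```python
import pickle


def try_pickle(obj):
    """Return the pickled bytes, or the exception raised while pickling."""
    try:
        return pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError) as exc:
        # pickle stores functions by reference (module plus qualified name).
        # A lambda has no importable name, so that lookup fails.
        return exc


named_result = try_pickle(len)             # importable by name, so it pickles
lambda_result = try_pickle(lambda: None)   # anonymous, so pickling fails

print(type(named_result).__name__)
print(type(lambda_result).__name__)
```

The exact exception type and message can differ between Python versions and between module-level and nested lambdas, which is why the sketch catches both `PicklingError` and `AttributeError`.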

I discovered cloudpickle, which can serialize and deserialize lambdas and closures using the same interface as pickle.

How can I force the Python multiprocessing module to use cloudpickle instead of pickle?

Clearly hacking the code of the standard lib multiprocessing is not an option!

Thanks

Charlie

343

asked Oct 25 '16 08:10

Charlie

2 Answers

Try multiprocess. It's a fork of multiprocessing that uses the dill serializer instead of pickle -- there are no other changes in the fork.

I'm the author. I encountered the same problem as you several years ago, and ultimately I decided that hacking the standard library was my only choice, as some of the pickle code in multiprocessing is in C.

>>> import multiprocess as mp
>>> p = mp.Pool()
>>> p.map(lambda x:x**2, range(4))
[0, 1, 4, 9]
>>>

173

answered Oct 05 '22 16:10

Mike McKerns

If you're willing to do a little monkeypatching, a quick fix is to sub out the pickle.Pickler:

import pickle
import cloudpickle
pickle.Pickler = cloudpickle.Pickler

or, in more recent versions of Python where _pickle.Pickler is pulled in,

from multiprocessing import reduction
import cloudpickle
reduction.ForkingPickler = cloudpickle.Pickler

If you patch pickle.Pickler, make sure to do it before importing multiprocessing, so that multiprocessing picks up the replacement when it is first loaded. Here's a full example:

import pickle
import cloudpickle
pickle.Pickler = cloudpickle.Pickler

import multiprocessing as mp
mp.set_start_method('spawn', force=True)

def procprint(f):
    print(f())

if __name__ == '__main__':
    p = mp.Process(target=procprint, args=(lambda: "hello",))
    p.start()
    p.join()

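As an aside, you won't need to do any of this to start a Process if your start method is fork, since a forked child inherits the target and its arguments from the parent's memory and nothing needs to be pickled in the first place. Objects sent later through queues, pipes, or a Pool are still pickled.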
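
To illustrate the point about fork, here is a minimal sketch using only the standard library. It assumes a POSIX platform where the fork start method is available (it is not on Windows); the function name `run_lambda_in_forked_child` and the value 42 are illustrative choices, not anything from the question.

```python
import multiprocessing as mp


def run_lambda_in_forked_child():
    """Start a forked child whose target is a lambda and return what it wrote."""
    # get_context raises ValueError on platforms without fork (e.g. Windows).
    ctx = mp.get_context('fork')
    # Shared-memory integer, so the child can report back to the parent.
    result = ctx.Value('i', 0)
    # With fork the child inherits the lambda from the parent's memory,
    # so the target is never pickled.
    child = ctx.Process(target=lambda: setattr(result, 'value', 42))
    child.start()
    child.join()
    return result.value


if __name__ == '__main__':
    print(run_lambda_in_forked_child())  # prints 42
```

Using `get_context('fork')` instead of `set_start_method` keeps the choice local to this function and leaves the global default start method untouched.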

answered Oct 05 '22 15:10

Andy Jones
