
When can a Python object be pickled

I'm doing a fair amount of parallel processing in Python using the multiprocessing module. I know certain objects can be pickled (and thus passed as arguments to multiprocessing workers) and others can't. E.g.

import pickle

class abc():
    pass

a=abc()
pickle.dumps(a)
'ccopy_reg\n_reconstructor\np1\n(c__main__\nabc\np2\nc__builtin__\nobject\np3\nNtRp4\n.'

But I have some larger classes in my code (a dozen methods, or so), and this happens:

a=myBigClass()
pickle.dumps(a)
Traceback (innermost last):
  File "<stdin>", line 1, in <module>
  File "/usr/apps/Python279/python-2.7.9-rhel5-x86_64/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle file objects

It's not a file object, but at other times, I'll get other messages that say basically: "I can't pickle this".

So what's the rule? Number of bytes? Depth of hierarchy? Phase of the moon?

Paul Nelson asked Apr 28 '15 14:04



1 Answer

I'm the dill author. There's a fairly comprehensive list of what pickles and what doesn't as part of dill. It can be run per version of Python 2.5–3.4, and adjusted for what pickles with dill or what pickles with pickle by changing one flag. See here and here.

The root of the rules for what pickles is (off the top of my head):

  1. Can you capture the state of the object by reference (i.e. a function defined in __main__ versus an imported function)? [Then, yes]
  2. Does a generic __getstate__ and __setstate__ rule exist for the given object type? [Then, yes]
  3. Does it depend on a Frame object (i.e. rely on the GIL and global execution stack)? Iterators are now an exception to this, by "replaying" the iterator on unpickling. [Then, no]
  4. Does the object instance point to the wrong class path (i.e. due to being defined in a closure, in C-bindings, or other __init__ path manipulations)? [Then, no]
  5. Is it considered dangerous by Python to allow this? [Then, no]

So, (5) is less prevalent now than it used to be, but it still has some lasting effects on pickle in the language. dill, for the most part, removes (1), (2), and (5), but is still fairly affected by (3) and (4).
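As a rough illustration of rules (1), (3), and (4) with the standard pickle module, here is a minimal sketch written for Python 3. The helper can_pickle and the function make_local_class are hypothetical names made up for this example; they are not part of pickle or dill.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:  # PicklingError, TypeError, or AttributeError, depending on the object
        return False


def make_local_class():
    # Rule 4: a class defined inside a function has no importable path.
    class Local(object):
        pass
    return Local


# Rule 1: an importable function is stored by reference (module name + function name).
print(can_pickle(math.sqrt))               # True
# Rule 1, failing case: a lambda has no name that pickle can look up again.
print(can_pickle(lambda x: x * x))         # False
# Rule 3: a generator depends on a live frame.
print(can_pickle(n for n in range(3)))     # False
# Rule 4: an instance whose class was defined inside a function.
print(can_pickle(make_local_class()()))    # False
```

With dill installed, several of the failing cases above are expected to serialize, since it stores more than a bare reference; that part is not shown here.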

I might be forgetting something else, but I think in general those are the underlying rules.
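For an error like "can't pickle file objects" on an instance that is not itself a file, the usual cause is an attribute that holds one. The following is a sketch for Python 3, under the assumption that the big class keeps an open file handle as an attribute. BigClass, FixedBigClass, and unpicklable_attributes are hypothetical names for illustration only.

```python
import os
import pickle


class BigClass(object):
    """Hypothetical stand-in for the question's myBigClass."""

    def __init__(self):
        self.name = "example"
        self.values = [1, 2, 3]
        self.log = open(os.devnull, "w")  # open file handle: pickle rejects it


def unpicklable_attributes(obj):
    """Return the names of instance attributes that pickle cannot serialize."""
    bad = []
    for attr_name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:  # the exception type varies with the offending object
            bad.append(attr_name)
    return bad


class FixedBigClass(BigClass):
    """Same data, but the file handle is left out of the pickled state."""

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["log"]  # drop the attribute pickle cannot handle
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.log = open(os.devnull, "w")  # reopen the handle after unpickling


print(unpicklable_attributes(BigClass()))                    # ['log']
restored = pickle.loads(pickle.dumps(FixedBigClass()))
print(restored.values)                                       # [1, 2, 3]
```

Defining __getstate__ and __setstate__ by hand like this is rule (2) applied to one class: you supply the state rule that pickle does not have for the file handle.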

Certain modules like multiprocessing register some objects that are important for their functioning. dill registers the majority of objects in the language.
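Registration of this kind goes through the copy_reg module (renamed copyreg in Python 3). Below is a minimal Python 3 sketch of registering a reduction function for one type. Counter and reduce_counter are hypothetical names for this example, and this is not how dill or multiprocessing define their own registrations.

```python
import copyreg  # this module is named copy_reg in Python 2
import pickle
import threading


class Counter(object):
    """Hypothetical class holding a lock, which pickle rejects by default."""

    def __init__(self, count=0):
        self.count = count
        self.lock = threading.Lock()


def reduce_counter(counter):
    """Tell pickle to rebuild a Counter by calling Counter(count)."""
    return Counter, (counter.count,)


# Before registration, the lock attribute makes pickling fail.
try:
    pickle.dumps(Counter(5))
    picklable_before = True
except TypeError:
    picklable_before = False

# Register the reduction function for the Counter type.
copyreg.pickle(Counter, reduce_counter)

clone = pickle.loads(pickle.dumps(Counter(5)))
print(picklable_before, clone.count)  # False 5
```

The clone gets a fresh lock from Counter.__init__, so only the count survives the round trip. That is the trade-off of any registered reducer: it decides which part of the state is worth keeping.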

The dill fork of multiprocessing is required because multiprocessing uses cPickle, and dill can only augment the pure-Python pickling registry. You could, if you have the patience, go through all the relevant copy_reg functions in dill and apply them to the cPickle module, and you'd get a much more pickle-capable multiprocessing. I've found a simple (read: one-liner) way to do this for pickle, but not for cPickle.

Mike McKerns answered Sep 18 '22 20:09