cPickle - Ignore stuff it can't serialize instead of raising an exception

Tags:

python

pickle

I'm using cPickle to serialize data that's used for logging.

I'd like to be able to throw whatever I want into an object, then serialize it. Usually this is fine with cPickle, but I just ran into a problem where one of the objects I wanted to serialize contained a function. This caused cPickle to raise an exception.

I would rather cPickle just skipped over stuff it can't deal with instead of causing the whole process to implode.

What is a good way to make this happen?

Chris Dutrow asked Jan 13 '13 18:01


2 Answers

I'm assuming that you're looking for a best-effort solution and you're okay if the unpickled results don't function properly.

For your particular use case, you may want to register a pickle handler for function objects. Just make it a dummy handler that's good enough for your best-effort purposes. Making a handler for functions is possible, but it's rather tricky. To avoid affecting other code that pickles, you'll probably want to deregister the handler when exiting your logging code.

Here's an example (without any deregistration):

import cPickle
import copy_reg
from types import FunctionType

# data to pickle: note that o['x'] is a lambda and they
# aren't natively picklable (at this time)
o = {'x': lambda x: x, 'y': 1}

# shows that o is not natively picklable (because of
# o['x'])
try:
    cPickle.dumps(o)
except TypeError:
    print "not natively picklable"
else:
    print "was pickled natively"

# create a mechanism to turn unpicklable functions into
# stub objects (the string "STUB" in this case)
def stub_pickler(obj):
    return stub_unpickler, ()
def stub_unpickler():
    return "STUB"
copy_reg.pickle(
    FunctionType,
    stub_pickler, stub_unpickler)

# shows that o is now picklable but o['x'] is restored
# to the stub object instead of its original lambda
print cPickle.loads(cPickle.dumps(o))

It prints:

not natively picklable
{'y': 1, 'x': 'STUB'}
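
On Python 3 (where cPickle is folded into pickle), a similar best-effort stub can probably be done without touching any global registry, so there's nothing to deregister afterwards. The following is only a rough sketch, assuming Python 3.8+ for the reducer_override hook; StubbingPickler and dumps_best_effort are made-up names for illustration. Note that it stubs every plain Python function, including named ones that pickle could otherwise store by reference.

```python
import io
import pickle
from types import FunctionType


class StubbingPickler(pickle.Pickler):
    """Hypothetical pickler that swaps plain functions for the string "STUB"."""

    def reducer_override(self, obj):
        if isinstance(obj, FunctionType):
            # On load this becomes str("STUB"), i.e. the string "STUB".
            # A builtin type is used as the reconstructor so that the
            # reconstructor itself is never routed back into this branch.
            return str, ("STUB",)
        # Everything else goes through the normal pickling machinery.
        return NotImplemented


def dumps_best_effort(obj):
    # The override lives on this pickler instance only, so other code
    # that calls pickle.dumps() is unaffected.
    buf = io.BytesIO()
    StubbingPickler(buf).dump(obj)
    return buf.getvalue()


o = {'x': lambda x: x, 'y': 1}
restored = pickle.loads(dumps_best_effort(o))
print(restored)  # {'x': 'STUB', 'y': 1}
```

The plain pickle.loads is enough on the reading side because the stub is stored as an ordinary call to str.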
Mr Fooz answered Sep 23 '22 06:09


Alternatively, try cloudpickle:

>>> import cloudpickle
>>> squared = lambda x: x ** 2
>>> pickled_lambda = cloudpickle.dumps(squared)

>>> import pickle
>>> new_squared = pickle.loads(pickled_lambda)
>>> new_squared(2)
4

Just pip install cloudpickle and live your dreams, the same dreams lived by dask, IPython parallel, and PySpark.

Kyle Kelley answered Sep 24 '22 06:09