I'd like to pass object state between two Python programs (one is my own code running standalone, the other is a Pyramid view) that live in different namespaces. Somewhat related questions are here or here, but I can't quite follow through with them for my scenario.
My own code defines a global class (i.e. in the __main__ namespace) of somewhat complex structure:
# An instance of this is a colorful mess of nested lists and sets and dicts.
class MyClass:
    def __init__(self):
        self.data = set()
        self.more = dict()
        ...
    def do_sth(self):
        ...
At some point I pickle an instance of this class:
import pickle

c = MyClass()
# Fill c with data.
# Pickle and write the MyClass instance within the __main__ namespace.
with open("my_c.pik", "wb") as f:
    pickle.dump(c, f, -1)
A hexdump -C my_c.pik shows that the first couple of bytes contain __main__.MyClass, from which I assume that the class is indeed defined in the global namespace, and that this is somehow a requirement for reading the pickle. Now I'd like to load this pickled MyClass instance from within a Pyramid view, which I assume is a different namespace:
# In Pyramid (different namespace) read the pickled MyClass instance.
with open("my_c.pik", "rb") as f:
    c = pickle.load(f)
But that results in the following error:
  File ".../views.py", line 60, in view_handler_bla
    c = pickle.load(f)
AttributeError: 'module' object has no attribute 'MyClass'
It seems to me that the MyClass definition is missing in whatever namespace the view code executes? I had hoped (assumed) that pickling is a somewhat opaque process which allows me to read a blob of data into whichever place I choose. (More on Python's class names and namespaces is here.)
How can I handle this properly? (Ideally without having to import stuff across...) Can I somehow find the current namespace and inject MyClass (like this answer seems to suggest)?
Poor Solution
It seems to me that if I refrain from defining and using MyClass and instead fall back to plain built-in datatypes, this wouldn't be a problem. In fact, I could "serialize" the MyClass object into a sequence of calls that pickle the individual elements of the MyClass instance:
# 'Manual' serialization of c works, because all elements are built-in types.
pickle.dump(c.data, f, -1)
pickle.dump(c.more, f, -1)
...
This would defeat the purpose of wrapping data into classes though.
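For completeness, reading such a file back means calling pickle.load once per dumped object, in the same order they were written. A minimal sketch, assuming an in-memory buffer in place of my_c.pik and made-up sample values standing in for c.data and c.more:

```python
import io
import pickle

# Made-up sample state standing in for c.data and c.more.
data = {1, 2, 3}
more = {"a": [1, 2], "b": {"nested": True}}

buf = io.BytesIO()  # stand-in for the file opened in "wb" mode
pickle.dump(data, buf, -1)
pickle.dump(more, buf, -1)

buf.seek(0)  # stand-in for reopening the file in "rb" mode
loaded_data = pickle.load(buf)  # first object that was written
loaded_more = pickle.load(buf)  # second object that was written
```

Because only built-in types are involved, the reader needs no class definition at all, which is exactly why this works and also why the class wrapper is lost.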
Note
Pickling stores only the state of an instance, not the methods defined on its class (e.g. do_sth() in the above example). That means that loading a MyClass instance in a namespace whose MyClass definition lacks those methods restores only the instance data; calling a missing method like do_sth() will raise an AttributeError.
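This can be checked directly on the pickled bytes, much like the hexdump in the question. The following is a small sketch written for Python 3; the flag names are mine, and the class is a cut-down version of the one above:

```python
import pickle


class MyClass:
    def __init__(self):
        self.data = set()
        self.more = dict()

    def do_sth(self):
        return len(self.data)


c = MyClass()
c.data.add(1)
blob = pickle.dumps(c, -1)

# The pickle names the class (module plus class name) and the attributes...
stores_class_name = b"MyClass" in blob
stores_module_name = MyClass.__module__.encode("utf-8") in blob
stores_attribute_names = b"data" in blob and b"more" in blob
# ...but nothing about the methods defined on the class.
stores_method = b"do_sth" in blob
```

So the blob is a reference to the class plus the instance state, and the loading side has to supply something that answers to that reference.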
Use dill instead of pickle, because dill by default pickles by serializing the class definition and not by reference.
>>> import dill
>>> class MyClass:
...     def __init__(self):
...         self.data = set()
...         self.more = dict()
...     def do_stuff(self):
...         return sorted(self.more)
...
>>> c = MyClass()
>>> c.data.add(1)
>>> c.data.add(2)
>>> c.data.add(3)
>>> c.data
set([1, 2, 3])
>>> c.more['1'] = 1
>>> c.more['2'] = 2
>>> c.more['3'] = lambda x:x
>>> def more_stuff(self, x):
...     return x+1
...
>>> MyClass.more_stuff = more_stuff
>>>
>>> with open('my_c.pik', "wb") as f:
...     dill.dump(c, f)
...
>>>
Shut down the session, and restart in a new session…
Python 2.7.8 (default, Jul 13 2014, 02:29:54)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> with open('my_c.pik', "rb") as f:
...     c = dill.load(f)
...
>>> c.data
set([1, 2, 3])
>>> c.more
{'1': 1, '3': <function <lambda> at 0x10473ec80>, '2': 2}
>>> c.do_stuff()
['1', '2', '3']
>>> c.more_stuff(5)
6
Get dill here: https://github.com/uqfoundation/dill
Solution 1
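On pickle.load, the module __main__ needs to have a function or class called MyClass. The pickle only records the name __main__.MyClass, so the unpickler looks that attribute up in the __main__ module of the loading process. This does not need to be the original class with the original source code; you can put other methods in it, and loading should still work.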
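import pickle

# A stand-in with the same name is enough, as long as it lives in __main__ of the loading process.
class MyClass(object):
    pass

with open("my_c.pik", "rb") as f:
    c = pickle.load(f)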
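A related route, which is not what the answer above describes, is to subclass pickle.Unpickler and override find_class so that the recorded name is redirected to a class of your choosing. This avoids touching __main__ in the loading process. The sketch below is written for Python 3 and rests on assumptions: RedirectingUnpickler, make_producer_pickle and the module name producer_script are hypothetical, and the throwaway module only simulates the standalone program so the snippet runs anywhere. In the question's setup the redirect key would be ("__main__", "MyClass").

```python
import io
import pickle
import sys
import types


class MyClass(object):
    """Consumer-side class; hypothetical replacement for the producer's MyClass."""

    def do_sth(self):
        return sorted(self.data)


class RedirectingUnpickler(pickle.Unpickler):
    # (module, name) as recorded in the pickle -> class to use instead.
    redirects = {("producer_script", "MyClass"): MyClass}

    def find_class(self, module, name):
        try:
            return self.redirects[(module, name)]
        except KeyError:
            return super().find_class(module, name)


def make_producer_pickle():
    """Simulate the standalone program with a throwaway module."""
    producer = types.ModuleType("producer_script")
    producer.MyClass = type("MyClass", (object,), {"__module__": "producer_script"})
    sys.modules["producer_script"] = producer
    try:
        c = producer.MyClass()
        c.data = {3, 1, 2}
        c.more = {"k": [1, 2]}
        return pickle.dumps(c, -1)
    finally:
        del sys.modules["producer_script"]  # the consumer cannot import it


blob = make_producer_pickle()

# A plain load fails because the recorded module cannot be imported here.
try:
    pickle.loads(blob)
    plain_load_failed = False
except ImportError:
    plain_load_failed = True

# The redirecting unpickler restores the state into the local class.
c = RedirectingUnpickler(io.BytesIO(blob)).load()
```

The restored object carries the pickled state but the methods of the consumer-side class, so the two definitions only need to agree on attribute names.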
Solution 2
Use the copyreg module (copy_reg in Python 2), which registers constructors and reduction functions for pickling specific objects. This is the example given by the module for a complex number:
import copyreg

def pickle_complex(c):
    return complex, (c.real, c.imag)

copyreg.pickle(complex, pickle_complex, complex)
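Applied to the question, one possible use (an assumption on my part, not something the answer spells out) is to register a reduction function that rebuilds a MyClass instance as a plain dict, so the pickle refers only to built-in types. The name reduce_my_class and the sample values are hypothetical:

```python
import copyreg
import pickle


class MyClass:
    def __init__(self):
        self.data = set()
        self.more = dict()


def reduce_my_class(obj):
    # Rebuild as a plain dict: the pickle then references only built-ins.
    return dict, ({"data": obj.data, "more": obj.more},)


# Register the reduction function for MyClass in the global dispatch table.
copyreg.pickle(MyClass, reduce_my_class)

c = MyClass()
c.data.update({1, 2})
c.more["k"] = [3, 4]

blob = pickle.dumps(c, protocol=2)
restored = pickle.loads(blob)  # a dict, loadable without any MyClass definition
```

The trade-off is the same as in the question's "poor solution": the reader gets the data but not a MyClass instance, unless it wraps the dict again itself.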
Solution 3
Override the persistent_id method of the Pickler and the persistent_load method of the Unpickler. pickler.persistent_id(obj) shall return an identifier that unpickler.persistent_load(id) can resolve back to the object.
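A minimal sketch of how this could look for MyClass, written for Python 3. StatePickler, StateUnpickler and the ("MyClass", state) tag format are hypothetical choices for illustration; each program would use its own MyClass definition, and the sketch keeps both sides in one file only so it can run on its own:

```python
import io
import pickle


class MyClass:
    def __init__(self):
        self.data = set()
        self.more = dict()


class StatePickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Store MyClass instances as a tag plus plain state, not by class reference.
        if isinstance(obj, MyClass):
            return ("MyClass", {"data": obj.data, "more": obj.more})
        return None  # everything else is pickled as usual


class StateUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        tag, state = pid
        if tag != "MyClass":
            raise pickle.UnpicklingError("unsupported persistent id: %r" % (tag,))
        # Rebuild with whatever MyClass is defined in the loading program.
        obj = MyClass.__new__(MyClass)
        obj.__dict__.update(state)
        return obj


c = MyClass()
c.data.update({1, 2})
c.more["k"] = "v"

buf = io.BytesIO()  # stand-in for my_c.pik
StatePickler(buf, -1).dump(c)

buf.seek(0)
restored = StateUnpickler(buf).load()
```

Both sides must agree on the tag format, but neither needs to import the other's module.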