Let's say I have a class like this:-
class MyClass:
some object here
some other object here
def init(self, some parameters):
do something
def some_other_method(self, param):
something else
def save(self, path):
PICKLE DUMP THIS OBJECT
def load(self, path):
PICKLE LOAD OBJECT
I don't want to pickle load and dump like:
obj = MyClass(param)
pickle.dump(obj, mypath)
But rather like this:
obj.save(mypath)
How can I do this within the class definition?
The pickle module can store things such as data types such as booleans, strings, and byte arrays, lists, dictionaries, functions, and more. Note: The concept of pickling is also known as serialization, marshaling, and flattening. However, the point is always the same—to save an object to a file for later retrieval.
Python Pickle dump dump() function to store the object data to the file.
In python, dumps() method is used to save variables to a pickle file.
You can pass self
instead of obj
. In other words:
def save(self, file_handler):
pickle.dump(self, file_handler)
The self
points to the instance of that class. So what you basically do is calling pickle.dump
and passing the instance to it together with the file_handler
argument.
Let's build a class A
, and try it...
>>> class A(object):
... x = 1
... def __init__(self, y):
... self.y = y
... def showme(self):
... return self.y + self.x
... def save(self):
... return pickle.dump(self)
... def load(self, pik):
... self.__dict__.update(pickle.loads(pik).__dict__)
...
>>> a = A(2)
>>> a.showme()
3
>>> import pickle
>>>
>>> a_ = a.save()
>>> a.y = 5
>>> a.showme()
6
>>> a.load(a_)
>>> a.y
2
>>> a.showme()
3
>>> b = A(9)
>>> b.load(a_)
>>> b.y
2
>>> b.showme()
3
>>> b.x = 4
>>> b.showme()
6
>>> b_ = b.save()
>>> a.load(b_)
>>> a.x
4
>>> a.y
2
>>> a.showme()
6
>>>
However, since you defined the class in __main__
, if you were to start over the python interpreter session… your pickles would be useless as the class would no longer exist. That's because python pickles by reference. However, there's a workaround for that. If you use dill
, you can pickle your classes by serializing the class definition as well. Then classes defined in __main__
will still be available in a new session.
>>> a.showme()
6
>>> import dill as pickle
>>> a.save()
'\x80\x02cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01U\x08TypeTypeq\x02\x85q\x03Rq\x04U\x01Aq\x05h\x01U\nObjectTypeq\x06\x85q\x07Rq\x08\x85q\t}q\n(U\x04loadq\x0bcdill.dill\n_create_function\nq\x0c(cdill.dill\n_unmarshal\nq\rU\xaec\x02\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00C\x00\x00\x00s \x00\x00\x00|\x00\x00j\x00\x00j\x01\x00t\x02\x00j\x03\x00|\x01\x00\x83\x01\x00j\x00\x00\x83\x01\x00\x01d\x00\x00S(\x01\x00\x00\x00N(\x04\x00\x00\x00t\x08\x00\x00\x00__dict__t\x06\x00\x00\x00updatet\x06\x00\x00\x00picklet\x05\x00\x00\x00loads(\x02\x00\x00\x00t\x04\x00\x00\x00selft\x03\x00\x00\x00pik(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x04\x00\x00\x00load\t\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x0e\x85q\x0fRq\x10c__builtin__\n__main__\nh\x0bNN}q\x11tq\x12Rq\x13U\r__slotnames__q\x14]q\x15U\n__module__q\x16U\x08__main__q\x17U\x06showmeq\x18h\x0c(h\rUuc\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x0e\x00\x00\x00|\x00\x00j\x00\x00|\x00\x00j\x01\x00\x17S(\x01\x00\x00\x00N(\x02\x00\x00\x00t\x01\x00\x00\x00yt\x01\x00\x00\x00x(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x06\x00\x00\x00showme\x05\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x19\x85q\x1aRq\x1bc__builtin__\n__main__\nh\x18NN}q\x1ctq\x1dRq\x1eU\x01xq\x1fK\x01U\x04saveq h\x0c(h\rU{c\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\r\x00\x00\x00t\x00\x00j\x01\x00|\x00\x00\x83\x01\x00S(\x01\x00\x00\x00N(\x02\x00\x00\x00t\x06\x00\x00\x00picklet\x05\x00\x00\x00dumps(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x04\x00\x00\x00save\x07\x00\x00\x00s\x02\x00\x00\x00\x00\x01q!\x85q"Rq#c__builtin__\n__main__\nh NN}q$tq%Rq&U\x07__doc__q\'NU\x08__init__q(h\x0c(h\rUuc\x02\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\r\x00\x00\x00|\x01\x00|\x00\x00_\x00\x00d\x00\x00S(\x01\x00\x00\x00N(\x01\x00\x00\x00t\x01\x00\x00\x00y(\x02\x00\x00\x00t\x04\x00\x00\x00selfR\x00\x00\x00\x00(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__init__\x03\x00\x00\x00s\x02\x00\x00\x00\x00\x01q)\x85q*Rq+c__builtin__\n__main__\nh(NN}q,tq-Rq.utq/Rq0)\x81q1}q2(U\x01yq3K\x02h\x1fK\x04ub.'
>>>
Then we quit the session, and restart. Pasting in the string from above. (Yes, I could work with a file handle instead, but I'll show that later…)
Python 2.7.9 (default, Dec 11 2014, 01:21:43)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill as pickle
>>>
>>> a = '\x80\x02cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01U\x08TypeTypeq\x02\x85q\x03Rq\x04U\x01Aq\x05h\x01U\nObjectTypeq\x06\x85q\x07Rq\x08\x85q\t}q\n(U\x04loadq\x0bcdill.dill\n_create_function\nq\x0c(cdill.dill\n_unmarshal\nq\rU\xaec\x02\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00C\x00\x00\x00s \x00\x00\x00|\x00\x00j\x00\x00j\x01\x00t\x02\x00j\x03\x00|\x01\x00\x83\x01\x00j\x00\x00\x83\x01\x00\x01d\x00\x00S(\x01\x00\x00\x00N(\x04\x00\x00\x00t\x08\x00\x00\x00__dict__t\x06\x00\x00\x00updatet\x06\x00\x00\x00picklet\x05\x00\x00\x00loads(\x02\x00\x00\x00t\x04\x00\x00\x00selft\x03\x00\x00\x00pik(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x04\x00\x00\x00load\t\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x0e\x85q\x0fRq\x10c__builtin__\n__main__\nh\x0bNN}q\x11tq\x12Rq\x13U\r__slotnames__q\x14]q\x15U\n__module__q\x16U\x08__main__q\x17U\x06showmeq\x18h\x0c(h\rUuc\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x0e\x00\x00\x00|\x00\x00j\x00\x00|\x00\x00j\x01\x00\x17S(\x01\x00\x00\x00N(\x02\x00\x00\x00t\x01\x00\x00\x00yt\x01\x00\x00\x00x(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x06\x00\x00\x00showme\x05\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x19\x85q\x1aRq\x1bc__builtin__\n__main__\nh\x18NN}q\x1ctq\x1dRq\x1eU\x01xq\x1fK\x01U\x04saveq h\x0c(h\rU{c\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\r\x00\x00\x00t\x00\x00j\x01\x00|\x00\x00\x83\x01\x00S(\x01\x00\x00\x00N(\x02\x00\x00\x00t\x06\x00\x00\x00picklet\x05\x00\x00\x00dumps(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x04\x00\x00\x00save\x07\x00\x00\x00s\x02\x00\x00\x00\x00\x01q!\x85q"Rq#c__builtin__\n__main__\nh NN}q$tq%Rq&U\x07__doc__q\'NU\x08__init__q(h\x0c(h\rUuc\x02\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\r\x00\x00\x00|\x01\x00|\x00\x00_\x00\x00d\x00\x00S(\x01\x00\x00\x00N(\x01\x00\x00\x00t\x01\x00\x00\x00y(\x02\x00\x00\x00t\x04\x00\x00\x00selfR\x00\x00\x00\x00(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__init__\x03\x00\x00\x00s\x02\x00\x00\x00\x00\x01q)\x85q*Rq+c__builtin__\n__main__\nh(NN}q,tq-Rq.utq/Rq0)\x81q1}q2(U\x01yq3K\x02h\x1fK\x04ub.'
>>>
>>> pickle.loads(a)
<__main__.A object at 0x105691c50>
>>> b = _
>>>
>>> b.x
4
>>> b.showme()
6
>>> A = b.__class__
>>> c = A(2)
>>> c.x
1
>>> c.showme()
3
Incredibly, the class is rebuilt in __main__
from within the pickled instance. Ok, so now, let's go about changing the class methods to use a new save
and load
that works with files instead of strings.
>>> def save(self, path):
... with open(path, 'w') as f:
... pickle.dump(self, f)
...
>>> def load(self, path):
... with open(path, 'r') as f:
... self.__dict__.update(pickle.load(f).__dict__)
...
>>> A.save = save
>>> A.load = load
>>>
>>> c.save('foo')
>>>
Then we quit the session and restart. Since we don't have a version of A
sitting around, we have to use the load
method directly from pickle
(actually, dill
in this case).
Python 2.7.9 (default, Dec 11 2014, 01:21:43)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill as pickle
>>> with open('foo', 'r') as f:
... a = pickle.load(f)
...
>>> a
<__main__.A object at 0x1028c0b10>
>>> a.x
1
>>> a.showme()
3
>>> a.y = 6
>>> a.showme()
7
>>> a.load('foo')
>>> a.y
2
>>> a.showme()
3
>>>
There might be a better, or more specific way, that you would want to load the state of the class instance, rather than updating the __dict__
. Doing this won't work in all cases, and it's probably better to customize for your class. Were it me, however, I would not have save
and load
methods in the class, but would use the methods provided by your serializer directly. You can see above how awkward/redundant it is to use the load
method from within the class.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With