Pickle Dump to save an object within the class

Let's say I have a class like this:

class MyClass:
    some_attribute = ...        # some object here
    some_other_attribute = ...  # some other object here

    def __init__(self, param):
        ...  # do something

    def some_other_method(self, param):
        ...  # something else

    def save(self, path):
        ...  # pickle-dump this object to path

    def load(self, path):
        ...  # pickle-load the object from path

I don't want to dump and load with pickle from outside the class, like this:

obj = MyClass(param)
pickle.dump(obj, mypath)

But rather like this:

obj.save(mypath)

How can I do this within the class definition?

Aditya asked May 26 '15 02:05




2 Answers

You can pass self instead of obj. In other words:

def save(self, file_handler):
    pickle.dump(self, file_handler)

Here self refers to the instance the method is called on, so save simply calls pickle.dump with that instance and the file_handler argument. Note that file_handler must be a file object opened for writing in binary mode, not a path string.
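If you would rather pass a path, as in the question, the method can open the file itself. The following is only a minimal sketch under that assumption: the value attribute, the file name obj.pkl, and the choice to make load a classmethod that returns a new instance are illustrative and not part of the original answer.

```python
import os
import pickle
import tempfile


class MyClass:
    def __init__(self, value):
        self.value = value  # hypothetical attribute, for illustration only

    def save(self, path):
        # pickle.dump needs a binary file object, so open the path here
        with open(path, "wb") as f:
            pickle.dump(self, f)

    @classmethod
    def load(cls, path):
        # Return a new instance instead of mutating an existing one
        with open(path, "rb") as f:
            obj = pickle.load(f)
        if not isinstance(obj, cls):
            raise TypeError(
                f"expected {cls.__name__}, got {type(obj).__name__}"
            )
        return obj


# Round trip through a temporary file
path = os.path.join(tempfile.mkdtemp(), "obj.pkl")
MyClass(42).save(path)
restored = MyClass.load(path)
print(restored.value)
```

With a classmethod, MyClass.load(path) works without first constructing a throwaway instance. As with any pickle, only load files you trust.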

boaz_shuster answered Oct 17 '22 02:10



Let's build a class A, and try it...

>>> class A(object):
...   x = 1
...   def __init__(self, y):
...     self.y = y
...   def showme(self):
...     return self.y + self.x
...   def save(self):
...     return pickle.dumps(self)
...   def load(self, pik):
...     self.__dict__.update(pickle.loads(pik).__dict__)
... 
>>> a = A(2)
>>> a.showme()
3
>>> import pickle
>>>         
>>> a_ = a.save()
>>> a.y = 5
>>> a.showme()
6
>>> a.load(a_)
>>> a.y
2
>>> a.showme()
3
>>> b = A(9)
>>> b.load(a_)
>>> b.y
2
>>> b.showme()
3
>>> b.x = 4
>>> b.showme()
6
>>> b_ = b.save()
>>> a.load(b_)
>>> a.x
4
>>> a.y
2
>>> a.showme()
6
>>> 

However, since the class is defined in __main__, your pickles would be useless if you restarted the Python interpreter session, because the class would no longer exist. That's because Python pickles classes by reference: the pickle records only the module and class name, not the class definition. There's a workaround, though. If you use dill, you can pickle the class definition along with the instance. Then classes defined in __main__ will still be available in a new session.
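As a quick check of the "by reference" point, here is a small sketch using only the standard library (the class Point and its doubled method are hypothetical names chosen for illustration). It inspects the pickled bytes: the class name is there, but the method is not.

```python
import pickle


class Point:
    def __init__(self, x):
        self.x = x

    def doubled(self):
        return 2 * self.x


data = pickle.dumps(Point(3))

# The payload names the class and carries the instance attributes...
has_class_name = b"Point" in data
# ...but not the method, which is looked up on the class at load time.
has_method_name = b"doubled" in data

print(has_class_name, has_method_name)
```

So unpickling only works where the class can be found again under the same module and name. With dill, by contrast, the class definition itself travels in the pickle, as the session that follows shows.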

>>> a.showme()
6
>>> import dill as pickle
>>> a.save()
'\x80\x02cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01U\x08TypeTypeq\x02\x85q\x03Rq\x04U\x01Aq\x05h\x01U\nObjectTypeq\x06\x85q\x07Rq\x08\x85q\t}q\n(U\x04loadq\x0bcdill.dill\n_create_function\nq\x0c(cdill.dill\n_unmarshal\nq\rU\xaec\x02\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00C\x00\x00\x00s \x00\x00\x00|\x00\x00j\x00\x00j\x01\x00t\x02\x00j\x03\x00|\x01\x00\x83\x01\x00j\x00\x00\x83\x01\x00\x01d\x00\x00S(\x01\x00\x00\x00N(\x04\x00\x00\x00t\x08\x00\x00\x00__dict__t\x06\x00\x00\x00updatet\x06\x00\x00\x00picklet\x05\x00\x00\x00loads(\x02\x00\x00\x00t\x04\x00\x00\x00selft\x03\x00\x00\x00pik(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x04\x00\x00\x00load\t\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x0e\x85q\x0fRq\x10c__builtin__\n__main__\nh\x0bNN}q\x11tq\x12Rq\x13U\r__slotnames__q\x14]q\x15U\n__module__q\x16U\x08__main__q\x17U\x06showmeq\x18h\x0c(h\rUuc\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x0e\x00\x00\x00|\x00\x00j\x00\x00|\x00\x00j\x01\x00\x17S(\x01\x00\x00\x00N(\x02\x00\x00\x00t\x01\x00\x00\x00yt\x01\x00\x00\x00x(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x06\x00\x00\x00showme\x05\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x19\x85q\x1aRq\x1bc__builtin__\n__main__\nh\x18NN}q\x1ctq\x1dRq\x1eU\x01xq\x1fK\x01U\x04saveq h\x0c(h\rU{c\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\r\x00\x00\x00t\x00\x00j\x01\x00|\x00\x00\x83\x01\x00S(\x01\x00\x00\x00N(\x02\x00\x00\x00t\x06\x00\x00\x00picklet\x05\x00\x00\x00dumps(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x04\x00\x00\x00save\x07\x00\x00\x00s\x02\x00\x00\x00\x00\x01q!\x85q"Rq#c__builtin__\n__main__\nh 
NN}q$tq%Rq&U\x07__doc__q\'NU\x08__init__q(h\x0c(h\rUuc\x02\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\r\x00\x00\x00|\x01\x00|\x00\x00_\x00\x00d\x00\x00S(\x01\x00\x00\x00N(\x01\x00\x00\x00t\x01\x00\x00\x00y(\x02\x00\x00\x00t\x04\x00\x00\x00selfR\x00\x00\x00\x00(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__init__\x03\x00\x00\x00s\x02\x00\x00\x00\x00\x01q)\x85q*Rq+c__builtin__\n__main__\nh(NN}q,tq-Rq.utq/Rq0)\x81q1}q2(U\x01yq3K\x02h\x1fK\x04ub.'
>>>

Then we quit the session and restart, pasting in the string from above. (Yes, I could work with a file handle instead, but I'll show that later…)

Python 2.7.9 (default, Dec 11 2014, 01:21:43) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill as pickle
>>> 
>>> a = '\x80\x02cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01U\x08TypeTypeq\x02\x85q\x03Rq\x04U\x01Aq\x05h\x01U\nObjectTypeq\x06\x85q\x07Rq\x08\x85q\t}q\n(U\x04loadq\x0bcdill.dill\n_create_function\nq\x0c(cdill.dill\n_unmarshal\nq\rU\xaec\x02\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00C\x00\x00\x00s \x00\x00\x00|\x00\x00j\x00\x00j\x01\x00t\x02\x00j\x03\x00|\x01\x00\x83\x01\x00j\x00\x00\x83\x01\x00\x01d\x00\x00S(\x01\x00\x00\x00N(\x04\x00\x00\x00t\x08\x00\x00\x00__dict__t\x06\x00\x00\x00updatet\x06\x00\x00\x00picklet\x05\x00\x00\x00loads(\x02\x00\x00\x00t\x04\x00\x00\x00selft\x03\x00\x00\x00pik(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x04\x00\x00\x00load\t\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x0e\x85q\x0fRq\x10c__builtin__\n__main__\nh\x0bNN}q\x11tq\x12Rq\x13U\r__slotnames__q\x14]q\x15U\n__module__q\x16U\x08__main__q\x17U\x06showmeq\x18h\x0c(h\rUuc\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x0e\x00\x00\x00|\x00\x00j\x00\x00|\x00\x00j\x01\x00\x17S(\x01\x00\x00\x00N(\x02\x00\x00\x00t\x01\x00\x00\x00yt\x01\x00\x00\x00x(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x06\x00\x00\x00showme\x05\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x19\x85q\x1aRq\x1bc__builtin__\n__main__\nh\x18NN}q\x1ctq\x1dRq\x1eU\x01xq\x1fK\x01U\x04saveq h\x0c(h\rU{c\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\r\x00\x00\x00t\x00\x00j\x01\x00|\x00\x00\x83\x01\x00S(\x01\x00\x00\x00N(\x02\x00\x00\x00t\x06\x00\x00\x00picklet\x05\x00\x00\x00dumps(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x04\x00\x00\x00save\x07\x00\x00\x00s\x02\x00\x00\x00\x00\x01q!\x85q"Rq#c__builtin__\n__main__\nh 
NN}q$tq%Rq&U\x07__doc__q\'NU\x08__init__q(h\x0c(h\rUuc\x02\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\r\x00\x00\x00|\x01\x00|\x00\x00_\x00\x00d\x00\x00S(\x01\x00\x00\x00N(\x01\x00\x00\x00t\x01\x00\x00\x00y(\x02\x00\x00\x00t\x04\x00\x00\x00selfR\x00\x00\x00\x00(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__init__\x03\x00\x00\x00s\x02\x00\x00\x00\x00\x01q)\x85q*Rq+c__builtin__\n__main__\nh(NN}q,tq-Rq.utq/Rq0)\x81q1}q2(U\x01yq3K\x02h\x1fK\x04ub.'
>>> 
>>> pickle.loads(a)
<__main__.A object at 0x105691c50>
>>> b = _
>>> 
>>> b.x
4
>>> b.showme()
6
>>> A = b.__class__  
>>> c = A(2)
>>> c.x
1
>>> c.showme()
3

Incredibly, the class is rebuilt in __main__ from within the pickled instance. OK, now let's replace the class methods with a new save and load that work with files instead of strings.

>>> def save(self, path):
...   with open(path, 'wb') as f:   # pickles are binary data
...     pickle.dump(self, f)
... 
>>> def load(self, path):
...   with open(path, 'rb') as f:
...     self.__dict__.update(pickle.load(f).__dict__)
... 
>>> A.save = save
>>> A.load = load
>>> 
>>> c.save('foo')
>>> 

Then we quit the session and restart. Since we don't have a version of A sitting around, we have to use the load method directly from pickle (actually, dill in this case).

Python 2.7.9 (default, Dec 11 2014, 01:21:43) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill as pickle
>>> with open('foo', 'rb') as f:
...   a = pickle.load(f)
... 
>>> a 
<__main__.A object at 0x1028c0b10>
>>> a.x
1
>>> a.showme()
3
>>> a.y = 6
>>> a.showme()
7
>>> a.load('foo')
>>> a.y    
2
>>> a.showme()
3
>>> 

There may be a better or more specific way to restore the state of a class instance than updating its __dict__. That approach won't work in all cases (a class that defines __slots__, for example, typically has no instance __dict__), so it's probably better to customize loading for your class. Were it me, however, I would not put save and load methods in the class at all, but would call the functions provided by the serializer directly. You can see above how awkward and redundant it is to use a load method from within the class: you need an instance before you can load one.
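One standard way to customize what gets saved and restored is to define __getstate__ and __setstate__, which pickle (and dill) call when dumping and loading. The sketch below is an assumption-laden illustration, not the answer's own method: the Counter class and its _lock attribute are hypothetical, chosen because a threading.Lock cannot be pickled.

```python
import pickle
import threading


class Counter:
    def __init__(self, start=0):
        self.count = start
        self._lock = threading.Lock()  # locks cannot be pickled

    def increment(self):
        with self._lock:
            self.count += 1

    def __getstate__(self):
        # Copy the instance dict and drop what cannot be serialized
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the saved attributes, then rebuild the lock
        self.__dict__.update(state)
        self._lock = threading.Lock()


c = Counter(5)
c.increment()

# Use the serializer's functions directly, as suggested above
clone = pickle.loads(pickle.dumps(c))
clone.increment()
print(c.count, clone.count)
```

With these hooks in place there is no need for save and load methods on the class; pickle.dump and pickle.load handle the instance correctly on their own.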

Mike McKerns answered Oct 17 '22 02:10
