Simple example of use of __setstate__ and __getstate__


Here's a very simple example for Python that should supplement the pickle docs.

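import pickle

class Foo(object):
    def __init__(self, val=2):
        self.val = val
    def __getstate__(self):
        # Called by pickle.dumps(); the return value is what gets pickled.
        print("I'm being pickled")
        self.val *= 2  # note: this also changes the original object
        return self.__dict__
    def __setstate__(self, d):
        # Called by pickle.loads() with the object returned by __getstate__.
        print("I'm being unpickled with these values: " + repr(d))
        self.__dict__ = d
        self.val *= 3

f = Foo()
f_data = pickle.dumps(f)
f_new = pickle.loads(f_data)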
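
A quick way to confirm the order of operations in the example above is to check the values afterwards. This is only a minimal sketch: Foo is repeated here without the print calls so that the snippet runs on its own.

```python
import pickle

class Foo(object):
    def __init__(self, val=2):
        self.val = val
    def __getstate__(self):
        self.val *= 2          # runs during pickle.dumps()
        return self.__dict__
    def __setstate__(self, d):
        self.__dict__ = d
        self.val *= 3          # runs during pickle.loads()

f = Foo()
f_new = pickle.loads(pickle.dumps(f))

# __getstate__ doubled the original in place: 2 * 2
assert f.val == 4
# __setstate__ received {'val': 4} and tripled it: 4 * 3
assert f_new.val == 12
```

Note that the original object f was changed by pickling it, which is usually not what you want.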

Minimal example

Whatever comes out of __getstate__ goes into __setstate__. It does not need to be a dict.

Whatever comes out of __getstate__ must be picklable, e.g. made up of basic built-ins like int, str, list.

class C(object):
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        return self.i
    def __setstate__(self, i):
        self.i = i
assert pickle.loads(pickle.dumps(C(1), -1)).i == 1
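
To illustrate the picklability requirement, here is a small sketch with two made-up classes, Bad and Good (neither is from the original answer). Bad returns a threading.Lock as its state, which pickle refuses. Good returns a plain tuple. The exact exception type may differ between Python versions, so the sketch catches several.

```python
import pickle
import threading

class Bad(object):
    def __getstate__(self):
        # A lock cannot be pickled, so this state is rejected by pickle.dumps().
        return threading.Lock()

class Good(object):
    def __init__(self, i, j):
        self.i = i
        self.j = j
    def __getstate__(self):
        # Any picklable object works as state, for example a tuple.
        return (self.i, self.j)
    def __setstate__(self, state):
        self.i, self.j = state

try:
    pickle.dumps(Bad())
    bad_failed = False
except (TypeError, pickle.PicklingError, AttributeError):
    bad_failed = True

good = pickle.loads(pickle.dumps(Good(1, 2)))
assert bad_failed
assert (good.i, good.j) == (1, 2)
```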

Default __setstate__

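If __setstate__ is not defined, the default behavior expects the state to be a dict and uses it to update the instance's __dict__.

Returning self.__dict__ from __getstate__ is a good choice, as in https://stackoverflow.com/a/1939384/895245, but we can construct a dict ourselves to better see what is going on: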

class C(object):
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        return {'i': self.i}
assert pickle.loads(pickle.dumps(C(1), -1)).i == 1
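
One way to see that the default behavior really just copies the dict into the instance is to return a key that was never an attribute. This is a sketch, and the 'extra' key is invented purely for illustration.

```python
import pickle

class C(object):
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        # 'extra' is a made-up key, not an attribute of the original object.
        return {'i': self.i, 'extra': 'added by __getstate__'}

c = pickle.loads(pickle.dumps(C(1), -1))
assert c.i == 1
assert c.extra == 'added by __getstate__'
assert c.__dict__ == {'i': 1, 'extra': 'added by __getstate__'}
```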

Default __getstate__

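This works analogously: if __getstate__ is not defined, the instance's __dict__ is used as the state and passed to __setstate__.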

class C(object):
    def __init__(self, i):
        self.i = i
    def __setstate__(self, d):
        self.i = d['i']
assert pickle.loads(pickle.dumps(C(1), -1)).i == 1
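
To check what the default state looks like, a custom __setstate__ can simply keep a copy of what it receives. In this sketch the attribute name received is hypothetical and exists only for inspection.

```python
import pickle

class C(object):
    def __init__(self, i, j):
        self.i = i
        self.j = j
    def __setstate__(self, d):
        self.received = dict(d)   # hypothetical attribute, kept for inspection
        self.__dict__.update(d)

c = pickle.loads(pickle.dumps(C(1, 2), -1))
# With no __getstate__ defined, the state is the instance __dict__.
assert c.received == {'i': 1, 'j': 2}
assert (c.i, c.j) == (1, 2)
```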

__slots__ objects don't have __dict__

If a class defines __slots__ and none of its bases provides a __dict__, its instances do not have a __dict__.

If you are going to implement both __getstate__ and __setstate__, the default-ish way is:

class C(object):
    __slots__ = ('i',)
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        return { slot: getattr(self, slot) for slot in self.__slots__ }
    def __setstate__(self, d):
        for slot in d:
            setattr(self, slot, d[slot])
assert pickle.loads(pickle.dumps(C(1), -1)).i == 1
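
Two details of __slots__ are easy to verify directly. The sketch below first checks that a slots-only instance has no __dict__ and rejects unknown attributes. It then shows why iterating over __slots__ is only safe when it is a tuple or list: a bare string declares a single slot, but iterating over it yields characters. The class Named is made up for this illustration.

```python
class C(object):
    __slots__ = ('i',)
    def __init__(self, i):
        self.i = i

c = C(1)
has_dict = hasattr(c, '__dict__')

try:
    c.j = 2                 # 'j' is not a declared slot
    rejected = False
except AttributeError:
    rejected = True

class Named(object):
    __slots__ = 'value'     # a single slot called 'value'

# Iterating over the string yields characters, not slot names.
chars = list(Named.__slots__)

assert has_dict is False
assert rejected
assert chars == ['v', 'a', 'l', 'u', 'e']
```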

__slots__ default __getstate__ and __setstate__ expect a tuple

If you want to reuse the default __getstate__ or __setstate__, you will have to pass tuples around as:

class C(object):
    __slots__ = ('i',)
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        return (None, { slot: getattr(self, slot) for slot in self.__slots__ })
assert pickle.loads(pickle.dumps(C(1), -1)).i == 1

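The tuple has the form (dict_state, slots_state): the default unpickling code uses the first item to update the instance __dict__ (None here, since there is none) and assigns each entry of the second item with setattr.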
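
The two-item tuple becomes easier to understand with a class that has both a __dict__ and a slot. In the sketch below, which is an illustration rather than part of the original answer, '__dict__' is listed in __slots__ so that instances get both kinds of storage. No __setstate__ is defined, so the default unpickling code applies the first tuple item to the instance __dict__ and the second one to the slots.

```python
import pickle

class C(object):
    # Listing '__dict__' gives instances both a slot and a regular __dict__.
    __slots__ = ('i', '__dict__')
    def __init__(self, i, extra):
        self.i = i          # stored in the slot
        self.extra = extra  # stored in __dict__
    def __getstate__(self):
        # (state for __dict__, state for slots)
        return (dict(self.__dict__), {'i': self.i})

c = pickle.loads(pickle.dumps(C(1, 'x'), -1))
assert c.i == 1
assert c.extra == 'x'
assert c.__dict__ == {'extra': 'x'}
```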

Inheritance

First see that pickling works by default:

class C(object):
    def __init__(self, i):
        self.i = i
class D(C):
    def __init__(self, i, j):
        super(D, self).__init__(i)
        self.j = j
d = pickle.loads(pickle.dumps(D(1, 2), -1))
assert d.i == 1
assert d.j == 2

Inheritance custom __getstate__

Without __slots__ it is easy: attributes set by both C and D live in the same instance __dict__, so we don't need to touch C at all:

class C(object):
    def __init__(self, i):
        self.i = i
class D(C):
    def __init__(self, i, j):
        super(D, self).__init__(i)
        self.j = j
    def __getstate__(self):
        return self.__dict__
    def __setstate__(self, d):
        self.__dict__ = d
d = pickle.loads(pickle.dumps(D(1, 2), -1))
assert d.i == 1
assert d.j == 2
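
That a single instance __dict__ holds the attributes of both classes can be checked directly. A minimal sketch:

```python
class C(object):
    def __init__(self, i):
        self.i = i
class D(C):
    def __init__(self, i, j):
        super(D, self).__init__(i)
        self.j = j

d = D(1, 2)
# One dict per instance, regardless of which class set the attribute.
assert d.__dict__ == {'i': 1, 'j': 2}
```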

Inheritance and __slots__

With __slots__, we need to forward to the base class, and can pass tuples around:

class C(object):
    __slots__ = ('i',)
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        return { slot: getattr(self, slot) for slot in C.__slots__ }
    def __setstate__(self, d):
        for slot in d:
            setattr(self, slot, d[slot])

class D(C):
    __slots__ = ('j',)
    def __init__(self, i, j):
        super(D, self).__init__(i)
        self.j = j
    def __getstate__(self):
        return (
            C.__getstate__(self),
            { slot: getattr(self, slot) for slot in D.__slots__ }
        )
    def __setstate__(self, ds):
        C.__setstate__(self, ds[0])
        d = ds[1]
        for slot in d:
            setattr(self, slot, d[slot])

d = pickle.loads(pickle.dumps(D(1, 2), -1))
assert d.i == 1
assert d.j == 2
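
An alternative to forwarding by hand in every subclass is to collect the slot names from the whole class hierarchy once. The following is only a sketch of that idea, not something taken from the original answer. The helper all_slots and the class SlotsPickleMixin are hypothetical names, and the isinstance(slots, str) check assumes Python 3.

```python
import pickle

def all_slots(cls):
    """Collect slot names declared by cls and all of its base classes."""
    names = []
    for klass in cls.__mro__:
        slots = klass.__dict__.get('__slots__', ())
        if isinstance(slots, str):
            slots = (slots,)   # a bare string declares a single slot
        names.extend(s for s in slots if s not in ('__dict__', '__weakref__'))
    return names

class SlotsPickleMixin(object):
    __slots__ = ()   # keep subclasses free of an instance __dict__
    def __getstate__(self):
        # Skip slots that were never assigned.
        return {name: getattr(self, name)
                for name in all_slots(type(self)) if hasattr(self, name)}
    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)

class C(SlotsPickleMixin):
    __slots__ = ('i',)
    def __init__(self, i):
        self.i = i

class D(C):
    __slots__ = ('j',)
    def __init__(self, i, j):
        super(D, self).__init__(i)
        self.j = j

d = pickle.loads(pickle.dumps(D(1, 2), -1))
assert d.i == 1
assert d.j == 2
```

With this approach C and D only declare their slots, and the state is a single flat dict instead of nested tuples.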

Unfortunately, it is not possible to reuse the default __getstate__ and __setstate__ of the base class (see https://groups.google.com/forum/#!topic/python-ideas/QkvOwa1-pHQ), so we are forced to define them.

Tested on Python 2.7.12. GitHub upstream.


These methods are used for controlling how objects are pickled and unpickled by the pickle module. This is usually handled automatically, so unless you need to override how a class is pickled or unpickled you shouldn't need to worry about it.


A clarification to @BrainCore's answer. In practice, you probably won't want to modify self inside __getstate__. Instead construct a new object that will get pickled, leaving the original unchanged for further use. Here's what that would look like:

import pickle

class Foo:
    def __init__(self, x:int=2, y:int=3):
        self.x = x
        self.y = y
        self.z = x*y

    def __getstate__(self):
        # Create a copy of __dict__ to modify values and return;
        # you could also construct a new dict (or other object) manually
        out = self.__dict__.copy()
        out["x"] *= 3
        out["y"] *= 10
        # You can remove attributes, but note they will not get set with
        # some default value in __setstate__ automatically; you would need
        # to write a custom __setstate__ method yourself; this might be
        # useful if you have unpicklable objects that need removing, or perhaps
        # an external resource that can be reloaded in __setstate__ instead of
        # pickling inside the stream
        del out["z"]
        return out

    # The default __setstate__ behavior updates the instance's __dict__;
    # so no need for a custom implementation here if __getstate__ returns a dict;
    # Be aware that __init__ is not called by default; Foo.__new__ gets called,
    # and the empty object is modified by __setstate__

f = Foo()
f_str = pickle.dumps(f)
f2 = pickle.loads(f_str)

print("Pre-pickle:", f.x, f.y, hasattr(f,"z"))
print("Post-pickle:", f2.x, f2.y, hasattr(f2,"z"))
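
The comments in the example above mention that a removed attribute has to be restored by a custom __setstate__. Here is a minimal sketch of that, assuming z can always be recomputed from x and y. The init_calls counter is a hypothetical addition that only serves to show that unpickling does not call __init__.

```python
import pickle

class Foo:
    init_calls = 0   # hypothetical counter, only to show __init__ is skipped

    def __init__(self, x=2, y=3):
        Foo.init_calls += 1
        self.x = x
        self.y = y
        self.z = x * y

    def __getstate__(self):
        out = self.__dict__.copy()
        del out["z"]          # derived value, not stored in the pickle
        return out

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.z = self.x * self.y   # rebuild what __getstate__ left out

f = Foo()
f2 = pickle.loads(pickle.dumps(f))

assert f.z == 6             # original is untouched
assert f2.z == 6            # recomputed in __setstate__
assert Foo.init_calls == 1  # unpickling did not call __init__
```

The same pattern applies to external resources such as open files or connections: drop them in __getstate__ and reopen them in __setstate__.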