Pickling dynamically created types

Tags: python, pickle

I've been trying to get some dynamically created types (i.e. ones created by calling 3-arg type()) to pickle and unpickle nicely. I've been using the module-switching trick (replacing the module's entry in sys.modules with an object of my own) to hide the details from users of the module and give clean semantics.

I've learned several things already:

  1. The type must be findable with getattr on the module itself
  2. The type must be consistent with what getattr finds: when we call pickle.dumps(o), type(o) must be the very same object that getattr(module, 'name of type') returns
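
As a minimal sketch of those two rules (the module name demo_mod and the class name Point are made up for illustration, and types.ModuleType stands in for a real module file), a type built with 3-arg type() round-trips once it is reachable by getattr on its module and is the same object pickle finds there:

```python
import pickle
import sys
import types

# Hypothetical stand-in module, registered so pickle can import it by name.
demo = types.ModuleType('demo_mod')
sys.modules['demo_mod'] = demo

# 3-arg type(); __module__ tells pickle which module to search.
Point = type('Point', (), {'__module__': 'demo_mod'})

# Rule 1: the type must be findable with getattr on the module.
demo.Point = Point

# Rule 2: what getattr finds must be the same object as type(o).
p = Point()
p.x = 3
assert getattr(demo, 'Point') is type(p)

restored = pickle.loads(pickle.dumps(p))
print(type(restored) is Point, restored.x)  # True 3
```

Dropping either the setattr on the module or the matching __module__ should make pickle.dumps fail, because pickle stores classes by name rather than by value.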

Where I'm stuck is that something odd still seems to be going on: pickle appears to be calling __getstate__ on something unexpected.

Here's the simplest setup I've got that reproduces the issue. I'm testing with Python 3.5, but I'd like to support versions back to 3.3 if possible:

# module.py
import sys
import functools

def dump(self):
    return b'Some data' # Dummy for testing

def undump(self, data):
    print('Undump: %r' % data) # Do nothing for testing

# Cheaty demo way to make this consistent
@functools.lru_cache(maxsize=None)
def make_type(name):
    return type(name, (), {
        '__getstate__': dump,
        '__setstate__': undump,
    })

class Magic(object):
    def __init__(self, path):
        self.path = path

    def __getattr__(self, name):
        print('Getting thing: %s (from: %s)' % (name, self.path))
        # for simple testing all calls to make_type must end in last x.y.z.last
        if name != 'last':
            if self.path:
                return Magic(self.path + '.' + name)
            else:
                return Magic(name)
        return make_type(self.path + '.' + name)

# Make the switch
sys.modules[__name__] = Magic('')

And then a quick way to exercise that:

import module
import pickle

f=module.foo.bar.woof.last()
print(f.__getstate__()) # See, *this* works
print('Pickle starts here')
print(pickle.dumps(f))

Which then gives:

Getting thing: foo (from: )
Getting thing: bar (from: foo)
Getting thing: woof (from: foo.bar)
Getting thing: last (from: foo.bar.woof)
b'Some data'
Pickle starts here
Getting thing: __spec__ (from: )
Getting thing: _initializing (from: __spec__)
Getting thing: foo (from: )
Getting thing: bar (from: foo)
Getting thing: woof (from: foo.bar)
Getting thing: last (from: foo.bar.woof)
Getting thing: __getstate__ (from: foo.bar.woof)
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    print(pickle.dumps(f))
TypeError: 'Magic' object is not callable

I wasn't expecting to see anything looking up __getstate__ on module.foo.bar.woof, but even if we force that lookup to fail by adding:

if name == '__getstate__': raise AttributeError()

into our __getattr__ it still fails with:

Traceback (most recent call last):
  File "test.py", line 7, in <module>
    print(pickle.dumps(f))
_pickle.PicklingError: Can't pickle <class 'module.Magic'>: it's not the same object as module.Magic

What gives? Am I missing something with __spec__? The docs for __spec__ pretty much just stress setting it appropriately, but don't seem to actually explain much.

More importantly, the bigger question is: how am I supposed to make types that I generate programmatically via a pseudo module's __getattr__ implementation pickle properly?

(And obviously, once I've managed to get pickle.dumps to produce something, I expect pickle.loads to call undump with the same thing.)

Flexo asked Sep 08 '17 21:09


1 Answer

To pickle f, pickle needs to pickle f's class, module.foo.bar.woof.last.

The docs don't claim support for pickling arbitrary classes. They claim the following:

The following types can be pickled:

  • ...
  • classes that are defined at the top level of a module

module.foo.bar.woof.last isn't defined at the top level of a module, even a pretend module like module. In this not-officially-supported case, the pickle logic ends up trying to pickle module.foo.bar.woof, either here, in the pure-Python implementation:

    elif parent is not module:
        self.save_reduce(getattr, (parent, lastname))

or here, in the C implementation:

    else if (parent != module) {
        PickleState *st = _Pickle_GetGlobalState();
        PyObject *reduce_value = Py_BuildValue("(O(OO))",
                                    st->getattr, parent, lastname);
        status = save_reduce(self, reduce_value, NULL);
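
To make parent and lastname in those excerpts concrete, here is a rough sketch of the dotted-name walk, assuming it behaves like repeated getattr calls. walk_qualname and the SimpleNamespace objects are hypothetical stand-ins for illustration, not pickle's actual code:

```python
import types

def walk_qualname(module, qualname):
    """Follow a dotted name from module; return (obj, parent, lastname)."""
    parent = None
    obj = module
    lastname = None
    for lastname in qualname.split('.'):
        parent = obj
        obj = getattr(parent, lastname)
    return obj, parent, lastname

# Hypothetical stand-in for the question's pseudo module.
mod = types.SimpleNamespace()
mod.Top = type('Top', (), {})
mod.foo = types.SimpleNamespace(bar=types.SimpleNamespace())
# 3-arg type() with a dotted name yields a dotted __qualname__.
mod.foo.bar.last = type('foo.bar.last', (), {})

_, parent, lastname = walk_qualname(mod, 'Top')
print(parent is mod, lastname)          # True Top

_, parent, lastname = walk_qualname(mod, mod.foo.bar.last.__qualname__)
print(parent is mod.foo.bar, lastname)  # True last
```

For a top-level class the parent is the module itself, so nothing extra needs pickling. For a dotted name the parent is an intermediate object, and the excerpts show pickle trying to save that object too.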

module.foo.bar.woof can't be pickled, for multiple reasons. Its __getattr__ returns a non-callable Magic instance for every lookup it doesn't recognise, including method lookups like __getstate__; pickle finds that "method" and tries to call it, which is where your first error comes from. Because the module has been swapped for a Magic instance, looking up the Magic class by name on the module doesn't return the real class, which is where your second error comes from. There are probably more incompatibilities.
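
One possible way around this, sketched under assumptions rather than tested against the module-switching setup: give the generated types a __reduce__ that refers only to a top-level function plus plain data, so pickle never has to look the class up by name. The helper name rebuild and the stand-in module demo_helpers are hypothetical; in practice rebuild would need to live in an ordinary importable module, not the one replaced in sys.modules.

```python
import functools
import pickle
import sys
import types

def dump(self):
    return b'Some data'  # dummy payload, as in the question

def undump(self, data):
    self.data = data  # keep the payload so the round trip is visible

def rebuild(name, data):
    # Called at unpickling time: recreate the type by name, then restore state.
    obj = make_type(name)()
    undump(obj, data)
    return obj

@functools.lru_cache(maxsize=None)
def make_type(name):
    def __reduce__(self):
        # Only a top-level function and plain data end up in the pickle.
        return (rebuild, (name, dump(self)))
    return type(name, (), {'__reduce__': __reduce__})

# Assumption: stand-in for the real module that would hold rebuild.
helpers = types.ModuleType('demo_helpers')
helpers.rebuild = rebuild
rebuild.__module__ = 'demo_helpers'
sys.modules['demo_helpers'] = helpers

f = make_type('foo.bar.woof.last')()
g = pickle.loads(pickle.dumps(f))
print(type(g) is type(f), g.data)  # True b'Some data'
```

This relies on make_type being cached, so that unpickling yields an instance of the very same class object.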

user2357112 supports Monica answered Oct 17 '22 02:10