
Pickle: dealing with updated class definitions

After a class definition is updated by recompiling a script, pickle refuses to serialize previously instantiated objects of that class, giving the error: "Can't pickle object: it's not the same object as "

Is there a way to tell pickle to ignore such cases, so that it identifies classes by name alone and ignores whichever internal unique ID is causing the mismatch?

I would definitely welcome as an answer the suggestion of an alternative, equivalent module which solves this problem in a convenient and robust manner.


For reference, here's my motivation:

I am creating a high-productivity, rapid-iteration development environment in which Python scripts are edited live. Scripts are repeatedly recompiled, but data persists across compiles. As part of the productivity goals, I am trying to use pickle for serialization, to avoid the cost of writing and updating explicit serialization code for constantly changing data structures.

Mostly I serialize built-in types. I am careful to avoid meaningful changes in the classes which I pickle, and when necessary I use the copy_reg.pickle mechanism to perform upconversion on unpickle.
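
For readers unfamiliar with that mechanism, here is a minimal sketch of upconversion through a registered reducer. It assumes Python 3, where the module is named copyreg rather than copy_reg; the Record class, the version numbers, and the old single-tag layout are all hypothetical and chosen only for illustration.

```python
import copyreg
import pickle


class Record:
    """Hypothetical current class: 'tags' is a list of strings."""

    def __init__(self, name, tags):
        self.name = name
        self.tags = tags


def rebuild_record(version, state):
    """Rebuild a Record, upgrading older state layouts on the way in."""
    if version == 1:
        # Assumed old layout: a single 'tag' string instead of a 'tags' list.
        state = {"name": state["name"], "tags": [state["tag"]]}
    return Record(**state)


def reduce_record(record):
    """Tell pickle to store a version number plus a plain dict of state."""
    return rebuild_record, (2, {"name": record.name, "tags": record.tags})


# Register the reducer; pickle consults this table before its default logic.
copyreg.pickle(Record, reduce_record)

# Current-format round trip goes through reduce_record and rebuild_record.
restored = pickle.loads(pickle.dumps(Record("a", ["x", "y"])))

# State saved under the assumed old layout is upgraded by the same function.
upgraded = rebuild_record(1, {"name": "b", "tag": "z"})
```

Because the payload holds only a version number and built-in types, the stored data itself carries no reference to the class object. The rebuild function is still stored by name, so it has to remain importable under that name.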

Script recompilation prevents me from pickling objects at all, even if class definitions have not actually changed (or have only changed in a benign way).

iestyn asked Apr 28 '13 23:04


1 Answer

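Pickle stores an instance by reference to its class (module plus name) and checks that the name still resolves to the very same class object. After a recompile, the name resolves to a new class object. So unless you can recover the earlier version of the class definition, the reference pickle needs to dump and load the instance is now gone, and strictly speaking this is "not possible".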
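
The failure can be reproduced without a real recompile. The following is only a sketch under stated assumptions: it targets Python 3, fakes a script module by executing the same source twice, and uses a hypothetical module name (live_script_demo) and a hypothetical Point class.

```python
import pickle
import sys
import types

# Hypothetical script source; executing it twice stands in for a recompile.
SOURCE = (
    "class Point:\n"
    "    def __init__(self, x, y):\n"
    "        self.x = x\n"
    "        self.y = y\n"
)

# Register a fake module so pickle can look the class up by name.
module = types.ModuleType("live_script_demo")
sys.modules["live_script_demo"] = module

exec(SOURCE, module.__dict__)       # first compile
old_instance = module.Point(1, 2)   # instance of the first Point class
old_class = type(old_instance)

exec(SOURCE, module.__dict__)       # "recompile": the name Point is rebound

# The instance still refers to the old class object, which the name
# live_script_demo.Point no longer resolves to.
try:
    pickle.dumps(old_instance)
    error = None
except pickle.PicklingError as exc:
    error = exc
```

The source text is identical in both runs, yet the second exec creates a distinct class object, which is why even a benign recompile triggers the error.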

However, if you did want to do it, you could save previous versions of your class definitions. You would then have to trick pickle into referring to your old, saved class definitions instead of the most current ones, which might just amount to editing obj.__class__ or obj.__module__ to point to your old class. There may also be other odd things in your class instance that refer to the old class definition, and you would have to handle those too. If you add or delete a class method, you may run into some unexpected results, or have to update the instance accordingly. Another interesting twist is that you could make the unpickler always use the most current version of your class.
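
A variation on the __class__ idea, going in the opposite direction, is to rebind a stale instance to whichever class currently has the same name before dumping it. The helper below is a hypothetical sketch, not part of pickle or dill. It assumes Python 3, classes that are reachable by module and qualified name, and old and new classes with compatible instance layouts (for example, neither uses __slots__).

```python
import pickle
import sys
import types


def rebind_to_current_class(obj):
    """Point obj.__class__ at the class now registered under the same name."""
    cls = type(obj)
    current = sys.modules[cls.__module__]
    for part in cls.__qualname__.split("."):
        current = getattr(current, part)
    if current is not cls:
        # Raises TypeError if the two classes have incompatible layouts.
        obj.__class__ = current
    return obj


# Demo with a fake, hypothetical script module that is executed twice.
SOURCE = (
    "class Point:\n"
    "    def __init__(self, x, y):\n"
    "        self.x = x\n"
    "        self.y = y\n"
)
module = types.ModuleType("live_script_rebind_demo")
sys.modules["live_script_rebind_demo"] = module

exec(SOURCE, module.__dict__)
point = module.Point(1, 2)
exec(SOURCE, module.__dict__)   # "recompile": Point is now a new class object

# After rebinding, pickle's identity check passes and the dump succeeds.
restored = pickle.loads(pickle.dumps(rebind_to_current_class(point)))
```

On the loading side, pickle already resolves classes by module and name at load time, so unpickled instances get the most current definition without extra work. The rebinding step does not migrate attributes, so any change in what the class expects to find on the instance still has to be handled separately.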

My serialization package, dill, has some methods that can dump compiled source from a live code object to a temporary file, and then serialize using that temporary file. It's one of the newer parts of the package, so it's not as robust as the rest of dill. Also, your use case is not a use case I'd considered, but I could see how it would be a nice feature to have.

Mike McKerns answered Oct 01 '22 20:10