
Pickling unpicklable objects

I am making a drawing program with pygame in which I want to give the user the option of saving the exact state of the program and then reloading it at a later time. At this point I save a copy of my globals dict and then iterate through it, pickling every object. There are some objects in pygame that cannot be pickled, but can be converted into strings and pickled that way. My code is set up to do this, but some of these unpicklable objects are being reached by reference. In other words, they aren't in the global dictionary, but they are referenced by objects in the global dictionary. I want to pickle them in this recursion, but I don't know how to tell pickle to return the object it had trouble with, change it, then try to pickle it again. My code is really kludgy; if there's a different, superior way to do what I'm trying to do, let me know.


import pickle
import sys
import types

import pygame

surfaceStringHeader = 'PYGAME.SURFACE_CONVERTED:'
imageToStringFormat = 'RGBA'

def save_project(filename=None):
    assert filename is not None, "Please specify path of project file"
    #Binary mode, since the converted surfaces contain raw pixel data
    with open(filename,'wb') as projectFile:
        pickler = pickle.Pickler(projectFile)
        for key, value in globals().copy().iteritems():
            #Skip modules, None, and anything that comes from pygame, sys or pickle
            if type(value) not in [types.ModuleType,type(None)] and \
            key not in ['__name__','value','key']          and \
            (key,value) not in pygame.__dict__.iteritems() and \
            (key,value) not in sys.__dict__.iteritems()    and \
            (key,value) not in pickle.__dict__.iteritems():
            #Perhaps I should add something to the above to reduce redundancy of
            #saving the program defaults?
                #Reformat unusable objects:
                if type(value)==pygame.Surface:
                    valueString = pygame.image.tostring(value,imageToStringFormat)
                    widthString = str(value.get_size()[0]).zfill(5)
                    heightString = str(value.get_size()[1]).zfill(5)
                    formattedValue = surfaceStringHeader+widthString+heightString+valueString
                else:
                    formattedValue = value

                try:
                    pickler.dump((key,formattedValue))
                except Exception as e:
                    print key+':' + str(e)

def open_project(filename=None):
    assert filename is not None, "Please specify path to project file"
    headerLength = len(surfaceStringHeader)
    with open(filename,'rb') as projectFile:
        unpickler = pickle.Unpickler(projectFile)
        #Keep loading (key,value) pairs until the end of the file
        while True:
            try:
                key,value = unpickler.load()
            except EOFError:
                break
            #Rework the unpicklable objects stored
            if type(value) == str and value.startswith(surfaceStringHeader):
                #Layout: header, 5-digit width, 5-digit height, pixel data
                width = int(value[headerLength:headerLength+5])
                height = int(value[headerLength+5:headerLength+10])
                value = pygame.image.frombuffer(value[headerLength+10:],(width,height),imageToStringFormat)
            setattr(sys.modules['__main__'],key,value)

hedgehogrider asked Dec 03 '12 20:12


4 Answers

In short: Don't do this.

Pickling everything in your application is messy and likely to cause problems. Take the data you need from your program and store it in an appropriate data format manually, then load it by creating the things you need back from that data.
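A minimal sketch of that approach, assuming the drawing can be described by a canvas size and a list of strokes. The layout of `state`, the field names and the version tag are hypothetical and not taken from the question. Only plain data is written (as JSON here), and the in-memory structure is rebuilt from it on load:

```python
import json

FORMAT_VERSION = 1  # hypothetical version tag so incompatible files can be rejected


def save_project(state, filename):
    """Write only the data needed to rebuild the drawing, not live objects."""
    document = {
        "version": FORMAT_VERSION,
        "canvas_size": list(state["canvas_size"]),
        "strokes": [
            {
                "color": list(stroke["color"]),
                "width": stroke["width"],
                "points": [list(point) for point in stroke["points"]],
            }
            for stroke in state["strokes"]
        ],
    }
    with open(filename, "w") as f:
        json.dump(document, f)


def load_project(filename):
    """Read the file back and rebuild the structure, restoring tuples."""
    with open(filename) as f:
        document = json.load(f)
    if document.get("version") != FORMAT_VERSION:
        raise ValueError("unsupported project file version")
    return {
        "canvas_size": tuple(document["canvas_size"]),
        "strokes": [
            {
                "color": tuple(stroke["color"]),
                "width": stroke["width"],
                "points": [tuple(point) for point in stroke["points"]],
            }
            for stroke in document["strokes"]
        ],
    }
```

In a real program the surfaces would be redrawn from the loaded strokes (or saved separately as image files), so nothing unpicklable ever needs to be serialized.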

Gareth Latty answered Nov 02 '22 16:11


You want to save the state of your entire program so that it can be reloaded at a later time. This is a perfect use case for pickle; I don't see a problem with the use case at all. However, your approach of pickling the globals() namespace and filtering out sys, pygame and pickle is wonky. The usual pattern is to have one session object which you pickle.

Also, I think there might be some confusion about how pickle works:

  1. When you pickle an object, all of the objects referenced by its member variables are pickled and unpickled automatically, which is good.
  2. If pickle cannot serialize an object, tell pickle how to save and restore it by writing custom __getstate__ and __setstate__ methods on that object's class. One or two of the classes nested inside your master session object will then have custom __getstate__/__setstate__ methods that do things like reopen devices such as file handles, which will obviously differ between sessions.
  3. If you need binary serialization, you don't need to cast the object into a string; have that object's __getstate__/__setstate__ methods pass the raw data through and dump with a binary pickle protocol (i.e. protocol 1 or higher).

In the end your code should look more like this:

import pickle

SESSION_FILE = 'session.pkl'  #hypothetical path, use whatever your program needs
session = None

def startsession(filename=SESSION_FILE):
    global session
    try:
        with open(filename,'rb') as f:
            session = pickle.load(f)
    except (IOError, EOFError):
        #No usable session file yet, so start a fresh session
        session = Session()

def savesession(filename=SESSION_FILE):
    with open(filename,'wb') as f:
        pickle.dump(session,f,pickle.HIGHEST_PROTOCOL)

class Session(object):
    def __init__(self):
        self.someobject=NewObject1()  #NewObject1 is a placeholder
        #.... plus whole object tree representing the whole game
        self.somedevicehandlethatcannotbepickled=GetDeviceHandle1()  #placeholder, for example
    def __getstate__(self):
        odict = self.__dict__.copy()
        del odict['somedevicehandlethatcannotbepickled'] #don't pickle this
        return odict
    def __setstate__(self, state):
        self.__dict__.update(state)
        #Reacquire the handle, since it was left out of the pickled state
        self.somedevicehandlethatcannotbepickled=GetDeviceHandle1()
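
For a type you cannot add `__getstate__`/`__setstate__` to (a `pygame.Surface` reached through other objects, say), a different standard-library route is `copyreg.pickle`, which registers a reduction function for a type. The following is a Python 3 sketch of that alternative (the module is called `copy_reg` in Python 2), not something the code above does. `FakeSurface`, `Layer` and this `Session` are hypothetical stand-ins so the example runs without pygame; with pygame, the reducer would presumably call `pygame.image.tostring` and the rebuild function `pygame.image.fromstring`.

```python
import copyreg
import pickle


class FakeSurface:
    """Hypothetical stand-in for a type that refuses to pickle by default."""

    def __init__(self, size, pixels):
        self.size = size
        self.pixels = pixels  # raw bytes, like the output of pygame.image.tostring

    def __reduce_ex__(self, protocol):
        raise TypeError("can't pickle FakeSurface objects")


def rebuild_surface(size, pixels):
    """Called by pickle at load time to recreate the object."""
    return FakeSurface(size, pixels)


def reduce_surface(surface):
    """Tell pickle which callable and arguments recreate the surface."""
    return rebuild_surface, (surface.size, surface.pixels)


# Register the reducer once. pickle consults this table before falling back to
# __reduce_ex__, so it applies wherever a FakeSurface is reached, however
# deeply it is nested inside the object being pickled.
copyreg.pickle(FakeSurface, reduce_surface)


class Layer:
    def __init__(self, name, surface):
        self.name = name
        self.surface = surface


class Session:
    def __init__(self):
        pixels = b"\x00\x01\x02\x03\x04\x05\x06\x07"  # 2x1 RGBA image
        self.layers = [Layer("background", FakeSurface((2, 1), pixels))]


# The surface is only reachable through Session -> list -> Layer.
data = pickle.dumps(Session(), protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(data)
```

Because the pixel data stays as bytes and a binary protocol is used, no string header or zero-padded width/height fields are needed.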

Riaz Rizvi answered Nov 02 '22 15:11


From your comments, it sounds like the hard part of what you're trying to do is to give the user a live interpreter, and save the state of that.

So, what about running that live interpreter as a subprocess? Any information from your object model that you want to expose to scripting, you do so explicitly (whether by multiprocessing shared memory, or some kind of message-passing API).

Then, you don't need to save the complete state of your own interpreter, which is either very hard or impossible; you save your data model in the normal way, and then you can freeze the sub-interpreter from the outside rather than the inside.
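A rough sketch of the message-passing half of this idea, written for Python 3. The names `CHILD_SOURCE` and `ScriptingProcess`, and the one-JSON-object-per-line protocol, are assumptions made for illustration. The child process keeps its own namespace for user scripts; the parent only sends explicit requests and never shares its own objects. Freezing the child from the outside is not shown.

```python
import json
import subprocess
import sys

# Source of the child process: a tiny scripting interpreter that keeps its own
# namespace and answers one JSON request per line on stdin/stdout.
CHILD_SOURCE = r'''
import json, sys
replies = sys.stdout
sys.stdout = sys.stderr      # keep print() in user scripts off the reply channel
namespace = {}
for line in iter(sys.stdin.readline, ""):
    request = json.loads(line)
    try:
        if request["op"] == "exec":
            exec(request["source"], namespace)
            reply = {"ok": True, "value": None}
        else:
            reply = {"ok": True, "value": repr(eval(request["source"], namespace))}
    except Exception as exc:
        reply = {"ok": False, "value": repr(exc)}
    replies.write(json.dumps(reply) + "\n")
    replies.flush()
'''


class ScriptingProcess:
    """Hypothetical parent-side wrapper around the scripting subprocess."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )

    def request(self, op, source):
        """Send one request ('exec' or 'eval') and wait for the reply."""
        self.proc.stdin.write(json.dumps({"op": op, "source": source}) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

    def close(self):
        """Closing stdin ends the child's read loop, so it exits on its own."""
        self.proc.stdin.close()
        self.proc.stdout.close()
        self.proc.wait()
```

With this split, a user script that monkeypatches something only affects the child's namespace; the drawing program's own data model and save code stay in the parent.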

This is obviously a lot more complicated than what you're trying to do, but I don't think that anything simple is actually going to work. For example, if the user has a live interpreter with your code, they can monkeypatch anything—even the pickling code—and then what happens? You need to define some limitations on exactly what can be saved and restored—and, if those limitations are broad enough, I think you have to do it from outside.

Meanwhile, as mentioned in a comment, both scipy (or some associated project that comes with Enthought) and ipython have save-and-restore functionality for limited use cases, which at least gives you some code to study, but their use cases may not be the same as yours.

abarnert answered Nov 02 '22 16:11


If you know all the unpicklable object types, then the code in the answer to the question "Recursively dir() a python object to find values of a certain type or with a certain value" might be helpful. I wrote it in response to a similar situation where I knew all the unpicklable object types but couldn't know where they were within a data structure. You could use this code to find them and replace them with something else, then on unpickling use similar code to put them back.
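The linked code is not reproduced here. As a rough Python 3 sketch of the same idea, the hypothetical `replace_matching` helper below walks dicts, lists and instance attributes in place and swaps every matching object for a picklable placeholder; the same helper then swaps the placeholders back after loading. `Handle` is a made-up stand-in for an unpicklable type such as `pygame.Surface`. Tuples and sets are not descended into, and an object referenced from two places comes back as two separate copies.

```python
import pickle
import types


def replace_matching(obj, matches, convert, _seen=None):
    """Walk dicts, lists and instance attributes in place, replacing every
    object for which matches(obj) is true with convert(obj)."""
    if matches(obj):
        return convert(obj)
    if _seen is None:
        _seen = set()
    if id(obj) in _seen or isinstance(obj, (type, types.ModuleType)):
        return obj  # already visited (handles cycles), or a class/module: skip
    _seen.add(id(obj))
    if isinstance(obj, dict):
        for key in list(obj):
            obj[key] = replace_matching(obj[key], matches, convert, _seen)
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            obj[index] = replace_matching(item, matches, convert, _seen)
    elif hasattr(obj, "__dict__"):
        replace_matching(vars(obj), matches, convert, _seen)
    return obj


class Handle:
    """Hypothetical stand-in for an unpicklable object."""

    def __init__(self, payload):
        self.payload = payload

    def __reduce_ex__(self, protocol):
        raise TypeError("can't pickle Handle objects")


MARKER = "__handle__"  # tag that identifies a placeholder tuple


def to_placeholder(handle):
    return (MARKER, handle.payload)


def is_placeholder(obj):
    return isinstance(obj, tuple) and len(obj) == 2 and obj[0] == MARKER


state = {"zoom": 2, "layers": [Handle(b"abc"), {"mask": Handle(b"xyz")}]}

# Before saving: swap the unpicklable objects for plain tuples, then pickle.
replace_matching(state, lambda o: isinstance(o, Handle), to_placeholder)
data = pickle.dumps(state)

# After loading: swap the placeholders back into real objects.
loaded = replace_matching(pickle.loads(data), is_placeholder, lambda p: Handle(p[1]))
```

Note that `state` itself is modified in place, so a running program would need to run the restoring pass on it as well after saving.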

Michael Scott Asato Cuthbert answered Nov 02 '22 15:11