 

How to get unpickling to work with iPython?

I'm trying to load pickled objects in iPython.

The error I'm getting is:

AttributeError: 'FakeModule' object has no attribute 'World'

Anybody know how to get it to work, or at least a workaround for loading objects in iPython in order to interactively browse them?

Thanks

edited to add:

I have a script called world.py that basically does:

import pickle
class World:
    ""
if __name__ == '__main__':
    w = World()
    pickle.dump(w, open("file", "wb"))

Then in a REPL I do:

import pickle  
from world import World  
w = pickle.load(open("file", "rb"))

which works in the vanilla python REPL but not with iPython.

I'm using Python 2.6.5 and iPython 0.10 both from the Enthought Python Distribution but I was also having the problem with previous versions.

Nils Fagerburg asked Aug 07 '10 17:08




2 Answers

Looks like you've modified FakeModule between the time you pickled your data, and the time you're trying to unpickle it: specifically, you have removed from that module some top-level object named World (perhaps a class, perhaps a function).

Pickling serializes classes and functions "by name", so they need to be names at their module's top level, and that module must not be modified (at least not in such a way as to affect those names badly -- definitely not by removing those names from the module!) between pickling time and unpickling time.
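To make "by name" concrete, here is a minimal sketch (written for Python 3, not the OP's 2.6). The module name world_demo and the describe method are made up for illustration, and the module is built in memory rather than loaded from a file. The pickle stores only a reference to world_demo.World, so whatever that name points to at load time is what you get back:

```python
import pickle
import sys
import types

# Hypothetical in-memory module standing in for a real .py file.
world_demo = types.ModuleType("world_demo")
sys.modules["world_demo"] = world_demo

OLD_SOURCE = "class World(object):\n    def describe(self):\n        return 'old'\n"
NEW_SOURCE = "class World(object):\n    def describe(self):\n        return 'new'\n"

# Define World at the module's top level and pickle an instance.
exec(OLD_SOURCE, world_demo.__dict__)
data = pickle.dumps(world_demo.World())

# Rebind the same top-level name to a different class before unpickling.
exec(NEW_SOURCE, world_demo.__dict__)

# The class body was never stored in the pickle, only its module and name,
# so the restored instance belongs to the class the name refers to now.
restored = pickle.loads(data)
result = restored.describe()
print(result)
```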

Once you've identified exactly what change you've done that impedes the unpickling, it can often be hacked around if for other reasons you can't just revert the change. For example, if you've just moved World from FakeModule to CoolModule, do:

import FakeModule
import CoolModule
FakeModule.World = CoolModule.World

just before unpickling (and remember to pickle again with the new structure so you won't have to keep repeating these hacks every time you unpickle;-).
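A self-contained sketch of that situation, assuming Python 3 and using two made-up in-memory modules (old_home and new_home) in place of FakeModule and CoolModule; make_module is just a helper for the demo:

```python
import pickle
import sys
import types

CLASS_SOURCE = "class World(object):\n    pass\n"

def make_module(name, source=""):
    """Create an in-memory module, run source in it, register it in sys.modules."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

# Pickling time: World lives in the (hypothetical) module old_home.
old_home = make_module("old_home", CLASS_SOURCE)
data = pickle.dumps(old_home.World())

# Later: World has been moved to new_home and removed from old_home.
new_home = make_module("new_home", CLASS_SOURCE)
del old_home.World

# Unpickling now fails, because old_home no longer has the name World.
try:
    pickle.loads(data)
    failed_before_alias = False
except AttributeError:
    failed_before_alias = True

# The workaround: put the name back where the pickle expects to find it.
old_home.World = new_home.World
restored = pickle.loads(data)
print(failed_before_alias, type(restored).__module__)
```

Re-pickling restored after this records new_home as the module, so the alias is only needed once.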

Edit: the OP's edit of the Q makes his error much easier to understand. Since he's now testing if __name__ equals '__main__', this makes it obvious that the pickle, when written, will be saving an object of class __main__.World. Since he's using ASCII pickles (a very bad choice for performance and disk space, by the way), it's trivial to check:

$ cat file
(i__main__
World
p0
(dp1

the module being looked up is (clearly and obviously) __main__. Now, without even bothering ipython but with a simple Python interactive interpreter:

$ py26
Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import world
>>> import pickle
>>> pickle.load(open("file", "rb"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/pickle.py", line 1370, in load
    return Unpickler(file).load()
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/pickle.py", line 858, in load
    dispatch[key](self)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/pickle.py", line 1069, in load_inst
    klass = self.find_class(module, name)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/pickle.py", line 1126, in find_class
    klass = getattr(mod, name)
AttributeError: 'module' object has no attribute 'World'
>>> 

the error can be easily reproduced, and its reason is just as obvious: the module in which the class name's lookup is performed (that is, __main__) does indeed have no attribute named "World". Module world does have one, but the OP has not "connected the dots" as I explained in the previous part of the answer, putting a reference with the right name in the module in which the pickled file needs it. That is:

>>> World = world.World
>>> pickle.load(open("file", "rb"))
<world.World instance at 0xf5300>
>>> 

now this works just perfectly, of course (and as I'd said earlier). Perhaps the OP is not seeing this problem because he's using the form of import I detest, from world import World (importing directly a function or class from within a module, rather than the module itself).

The hack to work around the problem in ipython is exactly the same in terms of the underlying Python architecture; it just requires a couple more lines of code. To supply all of its extra services, ipython does not use module __main__ directly to record what happens at the interactive command line; instead it interposes one of its own (called FakeModule, as the OP found out from the error msg;-) and does black magic with it in order to be "cool" &c. Still, whenever you want to get directly to a module with a given name, it's pretty trivial in Python, of course:

In [1]: import world

In [2]: import pickle

In [3]: import sys

In [4]: sys.modules['__main__'].World = world.World

In [5]: pickle.load(open("file", "rb"))
Out[5]: <world.World instance at 0x118fc10>

In [6]: 

Lesson to retain, number one: avoid black magic, at least unless and until you're good enough as a sorcerer's apprentice to be able to spot and fix its occasional runaway situations (otherwise, those bucket-carrying brooms may end up flooding the world while you nap;-).

Or, alternative reading: to properly use a certain layer of abstraction (such as the "cool" ones ipython puts on top of Python) you need strong understanding of the underlying layer (here, Python itself and its core mechanisms such as pickling and sys.modules).

Lesson number two: that pickle file is essentially broken, due to the way you've written it, because it can be loaded only when module __main__ has a class by the name World, which of course it normally will not have without some hacks like the above. The pickle file should instead record the class as living in module world. If you absolutely feel you must produce the file in an if __name__ == '__main__': clause in world.py, then use some redundancy for the purpose:

import pickle
class World:
    ""
if __name__ == '__main__':
    import world
    w = world.World()
    pickle.dump(w, open("file", "wb"))

this works fine and without hacks (at least if you follow the Python best practice of never having any substantial code at module top level -- only imports, class, def, and trivial assignments -- everything else belongs in functions; if you haven't followed this best practice, then edit your code to do so, it will make you much happier in terms of both flexibility and performance).
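If you want to check this end to end, the following sketch (Python 3; the temporary directory, the main() function and the use of subprocess are my own scaffolding, not part of the original script) writes a world.py along those lines, runs it as a script, and then loads the result from another process with no aliasing at all:

```python
import os
import pickle
import subprocess
import sys
import tempfile
import textwrap

# A world.py in the spirit of the fix above, with the work moved into a function.
WORLD_SOURCE = textwrap.dedent('''
    import pickle

    class World(object):
        pass

    def main():
        import world  # import this same file again under its real module name
        with open("file", "wb") as f:
            pickle.dump(world.World(), f)

    if __name__ == "__main__":
        main()
''')

tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "world.py"), "w") as f:
    f.write(WORLD_SOURCE)

# Run world.py as a script, which is how the pickle file gets written.
subprocess.check_call([sys.executable, "world.py"], cwd=tmp)

# Load the pickle from this process: importing world is all that is needed.
sys.path.insert(0, tmp)
import world

with open(os.path.join(tmp, "file"), "rb") as f:
    loaded = pickle.load(f)

loaded_class_module = type(loaded).__module__
print(loaded_class_module)
```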

Alex Martelli answered Nov 07 '22 13:11


When you pickle w in the __main__ module with pickle.dump(w, open("file", "wb")), the fact that w comes from the __main__ module is recorded on the first line of file:

% xxd file
0000000: 2869 5f5f 6d61 696e 5f5f 0a57 6f72 6c64  (i__main__.World
0000010: 0a70 300a 2864 7031 0a62 2e              .p0.(dp1.b.
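You can also ask Python itself which names a pickle refers to, without loading it. A small sketch using the standard pickletools module on the same bytes as in the dump above (Python 3 syntax; DATA is just those bytes typed in by hand):

```python
import pickletools

# The same bytes as in the hex dump above.
DATA = b"(i__main__\nWorld\np0\n(dp1\nb."

# INST and GLOBAL are the opcodes that carry a "module name" pair as their
# argument in the older, text-oriented pickle protocols.
referenced = [arg for opcode, arg, pos in pickletools.genops(DATA)
              if opcode.name in ("INST", "GLOBAL")]
print(referenced)  # ['__main__ World']
```

pickletools.dis(DATA) prints a full disassembly if you want to see every opcode.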

When IPython tries to unpickle file, it executes these lines:

/usr/lib/python2.6/pickle.pyc in find_class(self, module, name)
   1124         __import__(module)
   1125         mod = sys.modules[module]
-> 1126         klass = getattr(mod, name)
   1127         return klass
   1128 

In particular, it tries to execute __import__('__main__'). If you try that in the REPL, you get

In [29]: fake=__import__('__main__')

In [32]: fake
Out[32]: <module '__main__' from '/usr/lib/pymodules/python2.6/IPython/FakeModule.pyc'>

This is the FakeModule that IPython mentions in the AttributeError.

If you look inside fake.__dict__ you'll see it doesn't include World even if you say from world import World before or after the __import__.

If you run

In [35]: fake.__dict__['World']=World

Then pickle.load will work:

In [37]: w = pickle.load(open("file", "rb"))

There might be a cleaner way; I don't know. Any way you can think of that puts World in the fake namespace should work.
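One possibly cleaner option, which leaves __main__ alone, is to subclass pickle.Unpickler and redirect the lookup in find_class. This is only a sketch: I've written it for Python 3 and have not tried it under IPython 0.10, the class name RedirectingUnpickler is made up, and the world module is faked in memory so the snippet runs on its own:

```python
import io
import pickle
import sys
import types

# In-memory stand-in for world.py so that the sketch is self-contained.
world = types.ModuleType("world")
exec("class World(object):\n    pass\n", world.__dict__)
sys.modules["world"] = world

class RedirectingUnpickler(pickle.Unpickler):
    """Look up names that were pickled under __main__ in module world instead."""

    def find_class(self, module, name):
        if module == "__main__":
            module = "world"
        return pickle.Unpickler.find_class(self, module, name)

# The same bytes as the pickle file above, which references __main__.World.
DATA = b"(i__main__\nWorld\np0\n(dp1\nb."

obj = RedirectingUnpickler(io.BytesIO(DATA)).load()
print(type(obj).__module__, type(obj).__name__)
```

With a real file you would pass open("file", "rb") instead of the io.BytesIO object.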

PS. In 2008 Fernando Perez, the creator of IPython, wrote a little bit on this issue. He might have fixed this in some way that avoids my dirty hack. You might want to ask on the IPython-user mailing list, or, perhaps simpler, just don't pickle inside the __main__ namespace.

unutbu answered Nov 07 '22 13:11