
pickle: how does it pickle a function?

In a question I posted yesterday, I stumbled on the fact that changing the __qualname__ of a function has an unexpected effect on pickle. After running more tests, I found that pickle does not handle functions the way I assumed, and that changing a function's __qualname__ really does affect how pickle behaves.

Here are the tests I ran:

import pickle
from sys import modules

# a simple function to pickle 
def hahaha(): return 1

print('hahaha',hahaha,'\n')

# change the __qualname__ of function hahaha
hahaha.__qualname__ = 'sdfsdf'
print('set hahaha __qualname__ to sdfsdf',hahaha,'\n')

# make a copy of hahaha
setattr(modules['__main__'],'abcabc',hahaha)
print('create abcabc which is just hahaha',abcabc,'\n')

try:
    pickle.dumps(hahaha)
except Exception as e:
    print('pickle hahaha')
    print(e,'\n')

try:
    pickle.dumps(abcabc)
except Exception as e:
    print('pickle abcabc, a copy of hahaha')
    print(e,'\n')

try:
    pickle.dumps(sdfsdf)
except Exception as e:
    print('pickle sdfsdf')
    print(e)

As you can see by running the snippet, neither hahaha nor abcabc can be pickled; both raise the same exception:

Can't pickle <function sdfsdf at 0x7fda36dc5f28>: attribute lookup sdfsdf on __main__ failed.

This exception confuses me:

  1. What does pickle look for when it pickles a function? Although the __qualname__ of hahaha was changed to 'sdfsdf', the function hahaha and its copy abcabc are still callable in the session (both appear in dir(sys.modules['__main__'])), so why can't pickle pickle them?

  2. What is the real effect of changing the __qualname__ of a function? I understand that changing the __qualname__ of hahaha to 'sdfsdf' won't make sdfsdf callable, since it won't show up in dir(sys.modules['__main__']). However, as the output shows, after the change both hahaha and its copy abcabc print as <function sdfsdf at 'some_address'>. What is the difference between the objects in sys.modules['__main__'] and <function sdfsdf at 'some_address'>?

meTchaikovsky asked Sep 28 '20 02:09



1 Answer

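Pickling of function objects is implemented in the save_global method in pickle.py. Functions are pickled by reference, not by value: the pickle stores only the module name and the qualified name, and the function has to be found again under that name.

First, the name of the function is retrieved via __qualname__: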

name = getattr(obj, '__qualname__', None)
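On the second question: as far as I can tell, __qualname__ is only a label stored on the function object. The sketch below (the name alias is made up for illustration) suggests that assigning to it changes what repr() prints but binds no new name and leaves __name__ alone, so hahaha, abcabc and <function sdfsdf at ...> are all the same object.

```python
def hahaha():
    return 1

alias = hahaha                  # a second name bound to the same object
hahaha.__qualname__ = "sdfsdf"  # relabel the function; no new name is bound

print(hahaha)                   # the repr now shows sdfsdf instead of hahaha
print(alias is hahaha)          # both names still refer to one object
print(hahaha.__name__)          # __name__ is not touched by the assignment
print("sdfsdf" in globals())    # no global called sdfsdf was created
```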

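Next, once the module name has been determined, pickle makes sure that module is imported and fetches it from sys.modules. For a module that is already loaded, such as __main__, this does not reload anything: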

__import__(module_name, level=0)
module = sys.modules[module_name]

The module object is then used to look up the function as an attribute:

obj2, parent = _getattribute(module, name)

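obj2 is whatever object the module currently exposes under that name, and pickle requires it to be the very object being pickled, not a copy. Since __main__ has no attribute named sdfsdf, the lookup fails and pickling raises the error shown in the question.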
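The steps above can be imitated in a few lines. This is only a rough sketch, not the actual pickle code: lookup_status is a made-up helper, it takes the module name straight from __module__ (the real code goes through whichmodule), and demo_mod is a throwaway module standing in for __main__ so the example runs anywhere.

```python
import sys
import types

def lookup_status(obj):
    """Hypothetical helper that mimics pickle's by-reference check."""
    name = getattr(obj, "__qualname__", None) or obj.__name__
    module_name = obj.__module__        # simplification of whichmodule()
    try:
        __import__(module_name, level=0)    # does nothing if already imported
        target = sys.modules[module_name]
        for part in name.split("."):        # follow dotted qualified names
            target = getattr(target, part)
    except (ImportError, KeyError, AttributeError):
        return f"lookup of {name} on {module_name} failed"
    if target is not obj:
        return f"{module_name}.{name} is a different object"
    return "ok"

# Throwaway module (hypothetical name) standing in for __main__.
demo = types.ModuleType("demo_mod")
sys.modules["demo_mod"] = demo

def hahaha():
    return 1

hahaha.__module__ = "demo_mod"
demo.hahaha = hahaha
demo.abcabc = hahaha                # second binding, same object

before = lookup_status(hahaha)      # found under its own name
hahaha.__qualname__ = "sdfsdf"
after = lookup_status(demo.abcabc)  # demo_mod has no attribute sdfsdf
print(before)
print(after)
```

Note that the result is the same whether the function is reached as hahaha or as abcabc: the lookup uses the name stored on the object, not the variable you happened to pass in.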


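You can make this work, but you have to be consistent: the module must really have an attribute with the name stored in __qualname__, and it must be bound to the same function: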

>>> import sys
>>> import pickle
>>> def hahaha(): return 1
>>> hahaha.__qualname__ = "sdfsdf"
>>> setattr(sys.modules["__main__"], "sdfsdf", hahaha)
>>> pickle.dumps(hahaha)
b'\x80\x04\x95\x17\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x06sdfsdf\x94\x93\x94.'
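To see that only the reference is stored, here is a small sketch using the real pickle module. The module name demo_mod and the function impostor are made up for illustration; demo_mod stands in for __main__ so the example does not depend on how it is run.

```python
import pickle
import sys
import types

# Throwaway module (hypothetical name) standing in for __main__.
demo = types.ModuleType("demo_mod")
sys.modules["demo_mod"] = demo

def hahaha():
    return 1

hahaha.__module__ = "demo_mod"
hahaha.__qualname__ = "sdfsdf"
demo.sdfsdf = hahaha            # bind the function under its qualified name

data = pickle.dumps(hahaha)     # contains the names, not the function body
restored = pickle.loads(data)   # unpickling looks the name up again
print(restored is hahaha)

def impostor():
    return 2

demo.sdfsdf = impostor          # the name now points at a different object
try:
    pickle.dumps(hahaha)
    outcome = "pickled"
except pickle.PicklingError as exc:
    outcome = f"PicklingError: {exc}"
print(outcome)
```

The first part round-trips to the very same object, and the second part fails because the name resolves to something other than the function being pickled.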
L3viathan answered Oct 03 '22 06:10