I have a module whose only file, mod.py, defines two types: a namedtuple, foo, and a class, bar. I am able to create instances of both foo and bar without issue and pickle them. I am now trying to Cythonize the module so that I can distribute it as a compiled extension rather than as source.
The module file structure looks like:
./mod.pyx
./setup.py
./demo.py
The content of mod.pyx is:
import collections
foo = collections.namedtuple('foo', 'A B')
class bar:
    def __init__(self, A, B):
        self.A = A
        self.B = B
The content of setup.py is:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
setup(
    ext_modules=cythonize([Extension('mod', ['mod.pyx'])])
)
I cythonize it using the command python setup.py build_ext --inplace, which creates the compiled module file:
./mod.cp37-win_amd64.pyd
Running the following demo.py:
import mod, pickle
ham = mod.foo(1,2)
spam = mod.bar(1,2)
print(pickle.dumps(spam))
print(pickle.dumps(ham))
This successfully pickles spam, the instance of class bar, but fails on ham, the instance of namedtuple foo, with the error message:
PicklingError: Can't pickle <class 'importlib._bootstrap.foo'>: attribute lookup foo on importlib._bootstrap failed
This is all done in Python 3.7, if it matters. It seems like pickle can no longer find the class definition of mod.foo, even though Python is able to create an instance without issue. I know namedtuple has some weird behavior with respect to naming of the class it returns, and I admit I am a relative novice at packaging Cython modules.
A bit of googling turned up a few known issues with namedtuples and Cython, so I'm wondering if this might be part of a known issue, or if I am just packaging my module incorrectly.
In order for pickle to work, the __module__ attribute of the foo type must be set, and it should be mod.
namedtuple uses a heuristic (a lookup in sys._getframe(1).f_globals) to get this information:
def namedtuple(typename, field_names, *, rename=False, defaults=None, module=None):
    ...
    # For pickling to work, the __module__ variable needs to be set to the frame
    # where the named tuple is created. Bypass this step in environments where
    # sys._getframe is not defined (Jython for example) or sys._getframe is not
    # defined for arguments greater than 0 (IronPython), or where the user has
    # specified a particular module.
    if module is None:
        try:
            module = _sys._getframe(1).f_globals.get('__name__', '__main__')
        except (AttributeError, ValueError):
            pass
    if module is not None:
        result.__module__ = module
    ...
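The problem with Cython or C extensions is that this heuristic does not work. The compiled module code does not run in a Python frame of its own, so the caller's frame that namedtuple sees belongs to the import machinery, and _sys._getframe(1).f_globals.get('__name__', '__main__') yields importlib._bootstrap and not mod.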
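The effect can be reproduced without Cython. The following is only a sketch: it uses exec with a hand-built globals dict (fake_globals is a hypothetical name) to imitate a caller whose __name__ is importlib._bootstrap, which is an assumption standing in for what happens during the import of a compiled module.

```python
import collections
import pickle

# Hypothetical stand-in for the caller that namedtuple sees when it is
# invoked from a compiled extension: its globals report the importer's name.
fake_globals = {"collections": collections, "__name__": "importlib._bootstrap"}
exec("foo = collections.namedtuple('foo', 'A B')", fake_globals)
foo = fake_globals["foo"]

# The frame-based lookup picked up the wrong module name.
print(foo.__module__)

try:
    pickle.dumps(foo(1, 2))
    pickled = True
except pickle.PicklingError as exc:
    # pickle looks for an attribute foo on importlib._bootstrap and fails.
    pickled = False
    print(exc)
```

The class itself is perfectly usable; only the lookup by module name that pickle performs goes wrong.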
To fix that, you need to pass the right module name to the namedtuple factory (as pointed out in the code comments), i.e.:
foo = collections.namedtuple('foo', 'A B', module='mod')
or to keep it more generic:
foo = collections.namedtuple('foo', 'A B', module=__name__)
Now, after importing, foo.__module__ is mod, as expected by pickle, and ham can be pickled.
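A quick way to check the fix without building the extension is sketched below. It registers an empty module object named mod in sys.modules as a stand-in for the compiled module (an assumption made for illustration; in the real setup the .pyd provides it) and passes the module name explicitly.

```python
import collections
import pickle
import sys
import types

# Stand-in for the compiled extension; a real build would provide mod itself.
mod = types.ModuleType("mod")
sys.modules["mod"] = mod

# An explicit module name means the frame-based lookup is skipped entirely.
mod.foo = collections.namedtuple("foo", "A B", module="mod")

ham = mod.foo(1, 2)
restored = pickle.loads(pickle.dumps(ham))

print(mod.foo.__module__)
print(restored)
```

Inside mod.pyx itself, module=__name__ gives the same result and keeps working if the module is later renamed.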
By the way, pickling of bar works because Cython explicitly sets the right __module__ attribute (i.e. mod) while constructing the class.