I have a python program that loads quite a bit of data before running. As such, I'd like to be able to reload code without reloading data. With regular python, importlib.reload
has been working fine. Here's an example:
setup.py:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

extensions = [
    Extension("foo.bar", ["foo/bar.pyx"],
              language="c++",
              extra_compile_args=["-std=c++11"],
              extra_link_args=["-std=c++11"])
]

setup(
    name="system2",
    ext_modules=cythonize(extensions, compiler_directives={'language_level': "3"}),
)
foo/bar.pyx:
cpdef say_hello():
    print('Hello!')
runner.py:
import pyximport
pyximport.install(reload_support=True)
import foo.bar
import subprocess
from importlib import reload

if __name__ == '__main__':

    def reload_bar():
        p = subprocess.Popen('python setup.py build_ext --inplace',
                             shell=True,
                             cwd='<your directory>')
        p.wait()
        reload(foo.bar)
        foo.bar.say_hello()
But this doesn't seem to work. If I edit bar.pyx and run reload_bar() I don't see my changes. I also tried pyximport.build_module() with no luck -- the module rebuilt but didn't reload. I'm running in a "normal" Python shell, not IPython, if it makes a difference.
I was able to get a solution working for Python 2.x a lot more easily than for Python 3.x. For whatever reason, Cython seems to cache the shared-object (.so) file it imports your module from, and even after rebuilding and deleting the old file while the program is running, it still imports from the old shared-object file. However, that file isn't necessary anyway (when you import foo.bar via pyximport, it doesn't create one), so we can just skip that step.
The biggest problem was that Python kept a reference to the old module, even after reloading. Normal Python modules seem to work fine, but not anything Cython-related. To fix this, I execute two statements in place of reload(foo.bar):
del sys.modules['foo.bar']
import foo.bar
This successfully (though probably less efficiently) reloads the Cython module. The only remaining issue is that in Python 3.x, running that subprocess creates a problematic shared object. Instead, skip it altogether and let import foo.bar work its magic via the pyximport module, which recompiles for you. I also added an option to the pyximport.install() call to specify the language level, matching what's specified in setup.py:
pyximport.install(reload_support=True, language_level=3)
So all together:
runner.py
import sys
import pyximport
pyximport.install(reload_support=True, language_level=3)
import foo.bar

if __name__ == '__main__':

    def reload_bar():
        del sys.modules['foo.bar']
        import foo.bar

    foo.bar.say_hello()
    input(" press enter to proceed ")
    reload_bar()
    foo.bar.say_hello()
The other two files remain unchanged.
Running:
Hello!
 press enter to proceed 
(replace "Hello!" in foo/bar.pyx with "Hello world!", and press Enter)
Hello world!
Cython extensions are not the usual Python modules, and thus the behavior of the underlying OS shines through. This answer is about Linux, but other OSes have similar behavior/problems (well, Windows wouldn't even allow you to rebuild the extension while it is loaded).
A Cython extension is a shared object. When importing it, CPython opens this shared object via dlopen and calls its init function, i.e. PyInit_<module_name> in Python 3, which among other things registers the functions/functionality provided by the extension.
Once a shared object is loaded, we can no longer unload it, because there might be some Python objects still alive which would then have dangling pointers instead of function pointers to the functionality from the original shared object. See for example this CPython issue.
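For intuition, a minimal sketch (assuming the foo extension built in section A below): a surviving reference into the extension keeps working even after the module is dropped, which is exactly why the interpreter cannot safely unmap the shared object.

import sys
import foo                 # the extension from section A below

f = foo.doit               # a live Python object backed by code inside the .so
del sys.modules['foo']
del foo

f()                        # still prints 42 - the shared object stays mapped;
                           # unmapping it would leave f with a dangling pointer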
Another important thing: when dlopen is asked to load a shared object with the same path as an already loaded shared object, it will not read it from disk, but just reuse the already loaded version - even if there is a different version on disk.
And this is the problem with our approach: As long as the resulting shared object has the same name as the old one, you will never get to see the new functionality in the interpreter without restarting it.
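You can observe this path-based caching even without the import machinery, e.g. with ctypes (a sketch; the file name is just an assumed in-place build artifact):

import ctypes

lib1 = ctypes.CDLL("./foo.cpython-36m-x86_64-linux-gnu.so")   # first dlopen of this path
# ... rebuild the extension on disk, keeping the same file name ...
lib2 = ctypes.CDLL("./foo.cpython-36m-x86_64-linux-gnu.so")   # dlopen recognizes the path
# lib2 is backed by the very same, already loaded image - the new code on disk
# is never read as long as the process keeps the old one mapped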
What are your options?
A: Use pyximport with reload_support=True
Let's assume your Cython module (foo.pyx) looks as follows:
def doit():
    print(42)

# called when loaded:
doit()
Now import it with pyximport:
>>> import pyximport
>>> pyximport.install(reload_support=True)
>>> import foo
42
>>> foo.doit()
42
foo.pyx was built and loaded (we can see it prints 42 while loading, as expected). Let's take a look at foo's file:
>>> foo.__file__
'/home/XXX/.pyxbld/lib.linux-x86_64-3.6/foo.cpython-36m-x86_64-linux-gnu.so.reload1'
You can see the additional reload1 suffix compared to the case built with reload_support=False. Looking at the file name, we also verify that there is no other foo.so lying somewhere in the path and being wrongly loaded.
Now, let's change 42 to 21 in foo.pyx and reload the module:
>>> import importlib
>>> importlib.reload(foo)
21
>>> foo.doit()
42
>>> foo.__file__
'/home/XXX/.pyxbld/lib.linux-x86_64-3.6/foo.cpython-36m-x86_64-linux-gnu.so.reload2'
What happened? pyximport built an extension with a different suffix (reload2) and loaded it. This was successful because the name/path of the new extension differs due to the new suffix, and we can see 21 printed while it is loaded.
However, foo.doit()
is still the old version! If we look up the reload
-documentation, we see:
When reload() is executed:
Python module’s code is recompiled and the module-level code re-executed, defining a new set of objects which are bound to names in the module’s dictionary by reusing the loader which originally loaded the module. The init function of extension modules is not called a second time.
init (i.e. PyInit_<module_name>) isn't executed for extensions (and that includes Cython extensions), thus PyModuleDef_Init with the foo module definition isn't called, and one is stuck with the old definition bound to foo.doit. This behavior is sane, because for some extensions the init function isn't supposed to be called twice.
To fix it we have to import the module foo
once again:
>>> import foo
>>> foo.doit()
21
Now foo is reloaded as well as it's going to get - which means there might still be old objects in use. But I trust you to know what you are doing.
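If you want that dance in a single call, a minimal helper sketch (reload_and_rebind is just an illustrative name, assuming pyximport was installed with reload_support=True as above):

import importlib
import sys

def reload_and_rebind(mod):
    importlib.reload(mod)              # pyximport rebuilds and loads foo...so.reloadN
    return sys.modules[mod.__name__]   # fetch the freshly initialized module object

# usage:
# foo = reload_and_rebind(foo)
# foo.doit()                           # now prints 21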
B: Change the name of your extensions with every version
Another strategy could be to build the module foo.pyx
as foo_prefix1.so
and then foo_prefix2.so
and so on, and load it as
>>> import foo_prefixX as foo
This is the strategy used by the %%cython magic in IPython, which uses the sha1 hash of the Cython code as a prefix.
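For illustration, deriving such a content-dependent name could look roughly like this (a sketch; hashed_module_name is a hypothetical helper, and the generated name would also have to be used when building the extension so that PyInit_<name> matches):

import hashlib

def hashed_module_name(pyx_source, base="foo"):
    # every edit of the source yields a different module/extension name
    digest = hashlib.sha1(pyx_source.encode("utf-8")).hexdigest()[:12]
    return "{}_{}".format(base, digest)    # e.g. foo_<12 hex chars>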
One can emulate IPython's approach using imp.load_dynamic (or its re-implementation with the help of importlib, since imp is deprecated):
from importlib._bootstrap import _load

def load_dynamic(name, path, file=None):
    """
    Load an extension module.
    """
    import importlib.machinery
    loader = importlib.machinery.ExtensionFileLoader(name, path)

    # Issue #24748: Skip the sys.modules check in _load_module_shim;
    # always load new extension
    spec = importlib.machinery.ModuleSpec(
        name=name, loader=loader, origin=path)
    return _load(spec)
And now, putting the .so files e.g. into different folders (or adding some suffix), so that dlopen sees them as different from the previous version, we can use it:
# first argument (name="foo") tells how the init-function
# of the extension (i.e. `PyInit_<module_name>`) is called
foo = load_dynamic("foo", "1/foo.cpython-37m-x86_64-linux-gnu.so")
# now foo has new functionality:
foo = load_dynamic("foo", "2/foo.cpython-37m-x86_64-linux-gnu.so")
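One way to produce those numbered folders is to rebuild in place (as with the question's setup.py) and copy the result away before loading. A rough sketch - build_versioned and the .so file name are assumptions, adjust them to your build:

import shutil
import subprocess
from pathlib import Path

SO_NAME = "foo.cpython-37m-x86_64-linux-gnu.so"    # whatever your build produces

def build_versioned(version):
    # rebuild the extension in place, then stash the fresh .so under a new path
    subprocess.run(["python", "setup.py", "build_ext", "--inplace"], check=True)
    target = Path(str(version))                    # e.g. "1/", "2/", ...
    target.mkdir(exist_ok=True)
    shutil.copy(SO_NAME, target / SO_NAME)
    return str(target / SO_NAME)

# foo = load_dynamic("foo", build_versioned(1))
# ... edit foo.pyx ...
# foo = load_dynamic("foo", build_versioned(2))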
Even if reloading in general, and reloading of extensions in particular, is kind of hacky, for prototyping purposes I would probably go with the pyximport solution... or use IPython and the %%cython magic.