
reload module with pyximport?

Tags:

python

cython

I have a Python program that loads quite a bit of data before running, so I'd like to be able to reload code without reloading the data. With regular Python modules, importlib.reload has been working fine. Here's an example with a Cython module:

setup.py:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

extensions = [
    Extension("foo.bar", ["foo/bar.pyx"],
              language="c++",
              extra_compile_args=["-std=c++11"],
              extra_link_args=["-std=c++11"])
]
setup(
    name="system2",
    ext_modules=cythonize(extensions, compiler_directives={'language_level' : "3"}),
)

foo/bar.pyx:

cpdef say_hello():
    print('Hello!')

runner.py:

import pyximport
pyximport.install(reload_support=True)

import foo.bar
import subprocess
from importlib import reload

if __name__ == '__main__':

    def reload_bar():
        p = subprocess.Popen('python setup.py build_ext --inplace',
                             shell=True,
                             cwd='<your directory>')
        p.wait()

        reload(foo.bar)
        foo.bar.say_hello()

But this doesn't seem to work. If I edit bar.pyx and run reload_bar I don't see my changes. I also tried pyximport.build_module() with no luck -- the module rebuilt but didn't reload. I'm running in a "normal" python shell, not IPython if it makes a difference.

slushi asked Mar 08 '19 04:03


2 Answers

I was able to get a solution working for Python 2.x much more easily than for Python 3.x. For whatever reason, Cython seems to cache the shared object (.so) file it imports your module from; even after rebuilding and deleting the old file while the interpreter is running, it still imports from the old shared object file. However, that file isn't necessary anyway (importing foo.bar through pyximport doesn't create one), so we can skip it.

The largest problem was that Python kept a reference to the old module, even after reloading. Normal Python modules seem to work fine, but nothing Cython-related does. To fix this, I execute two statements in place of reload(foo.bar):

del sys.modules['foo.bar']
import foo.bar

This successfully (though probably less efficiently) reloads the Cython module. The only remaining issue is that, in Python 3.x, running that subprocess creates a problematic shared object. Instead, skip the subprocess altogether and let import foo.bar work its magic with pyximport, which recompiles the module for you. I also passed the language_level option to pyximport.install so that it matches what you've specified in setup.py:

pyximport.install(reload_support=True, language_level=3)

So all together:

runner.py:

import sys
import pyximport
pyximport.install(reload_support=True, language_level=3)

import foo.bar

if __name__ == '__main__':
    def reload_bar():
        del sys.modules['foo.bar']
        import foo.bar

    foo.bar.say_hello()
    input("  press enter to proceed  ")
    reload_bar()
    foo.bar.say_hello()

The other two files remain unchanged.

Running:

Hello!
  press enter to proceed

Replace "Hello!" in foo/bar.pyx with "Hello world!", then press Enter:

Hello world!
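
The two statements can be wrapped in a small helper. What follows is only a sketch: `fresh_import` is a hypothetical name, not part of pyximport, and the demo uses a throwaway pure-Python module (`fresh_import_demo`, also made up) so that it runs without Cython. The assumption is that the same `sys.modules` mechanics apply once pyximport is installed as the importer.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def fresh_import(name):
    """Forget any cached module called `name` and import it again."""
    sys.modules.pop(name, None)      # same effect as `del sys.modules[name]`
    importlib.invalidate_caches()    # make the finders re-scan the directories
    return importlib.import_module(name)


# Demo setup (assumption: a plain .py file stands in for the .pyx module).
sys.dont_write_bytecode = True       # avoid stale .pyc files in this demo
workdir = Path(tempfile.mkdtemp())
sys.path.insert(0, str(workdir))
module_file = workdir / "fresh_import_demo.py"

module_file.write_text("VALUE = 'Hello!'\n")
first = fresh_import("fresh_import_demo").VALUE

module_file.write_text("VALUE = 'Hello world!'\n")
second = fresh_import("fresh_import_demo").VALUE

print(first, "->", second)
```

Returning the module from the helper means the caller decides which name to rebind, e.g. `foo.bar = fresh_import("foo.bar")` is not needed because the import system already resets the `bar` attribute on the `foo` package.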

Dillon Davis answered Nov 20 '22 12:11


Cython extensions are not ordinary Python modules, so the behavior of the underlying OS shines through. This answer is about Linux, but other OSes have similar behavior and problems (Windows wouldn't even allow you to rebuild the extension while it is loaded).

A Cython extension is a shared object. When importing it, CPython opens this shared object via dlopen and calls the init function, i.e. PyInit_<module_name> in Python 3, which among other things registers the functions/functionality provided by the extension.

Once a shared object is loaded, it can no longer be unloaded, because some Python objects might still be alive that would then hold dangling pointers instead of function pointers to the functionality from the original shared object. See for example this CPython issue.

Another important thing: when dlopen loads a shared object with the same path as an already loaded shared object, it does not read it from disk but reuses the already loaded version, even if there is a different version on disk.

And this is the problem with your approach: as long as the resulting shared object has the same name as the old one, you will never see the new functionality in the interpreter without restarting it.
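
The reuse of an already loaded shared object can be observed from Python with ctypes, which also goes through dlopen on Linux. This is a rough sketch under assumptions: it relies on `ctypes.util.find_library` being able to locate the C math library (or libc), and it compares the private `_handle` attribute purely for illustration. It shows that a second load of the same path hands back the already loaded object; it does not swap the file on disk.

```python
import ctypes
import ctypes.util


def repeated_load_reuses_handle():
    """Load the same shared library twice and compare the two handles.

    Returns True/False for the comparison, or None if no suitable
    library could be located or opened on this system.
    """
    # Assumption: libm or libc is discoverable on this machine.
    libname = ctypes.util.find_library("m") or ctypes.util.find_library("c")
    if libname is None:
        return None
    try:
        first = ctypes.CDLL(libname)
        second = ctypes.CDLL(libname)
    except OSError:
        return None
    # _handle is a private ctypes attribute holding the loader's handle.
    return first._handle == second._handle


reused = repeated_load_reuses_handle()
print("same handle on second load:", reused)
```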

What are your options?

A: Use pyximport with reload_support=True

Let's assume your Cython module (foo.pyx) looks as follows:

def doit(): 
    print(42)
# called when loaded:
doit()

Now import it with pyximport:

>>> import pyximport
>>> pyximport.install(reload_support=True)
>>> import foo
42
>>> foo.doit()
42

foo.pyx was built and loaded (we can see that it prints 42 while loading, as expected). Let's take a look at the file of foo:

>>> foo.__file__
'/home/XXX/.pyxbld/lib.linux-x86_64-3.6/foo.cpython-36m-x86_64-linux-gnu.so.reload1'

You can see the additional reload1 suffix compared to a build with reload_support=False. The file name also confirms that there is no other foo.so lying around somewhere on the path and being loaded by mistake.

Now, let's change 42 to 21 in the foo.pyx and reload the file:

>>> import importlib
>>> importlib.reload(foo)
21
>>> foo.doit()
42
>>> foo.__file__
'/home/XXX/.pyxbld/lib.linux-x86_64-3.6/foo.cpython-36m-x86_64-linux-gnu.so.reload2'

What happened? pyximport built an extension with a different suffix (reload2) and loaded it. This succeeded because the name/path of the new extension differs thanks to the new suffix, and we can see 21 printed while it was loaded.

However, foo.doit() is still the old version! If we look up the reload-documentation, we see:

When reload() is executed:

Python module’s code is recompiled and the module-level code re-executed, defining a new set of objects which are bound to names in the module’s dictionary by reusing the loader which originally loaded the module. The init function of extension modules is not called a second time.

The init function (i.e. PyInit_<module_name>) isn't executed again for extensions (and that includes Cython extensions), so PyModuleDef_Init isn't called with the module definition of foo, and one is stuck with the old definition bound to foo.doit. This behavior is sane, because for some extensions the init function isn't supposed to be called twice.

To fix it we have to import the module foo once again:

>>> import foo
>>> foo.doit()
21

Now foo is reloaded as well as it can be, which means some old objects might still be in use. But I trust you know what you're doing.
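
The caveat about old objects is easy to reproduce. The sketch below uses a throwaway pure-Python module (`stale_demo`, a made-up name) as a stand-in for the extension, so it runs without Cython; the assumption is that references taken from an extension module before the re-import behave the same way.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True           # keep the demo free of stale .pyc files
workdir = Path(tempfile.mkdtemp())
sys.path.insert(0, str(workdir))
source = workdir / "stale_demo.py"

source.write_text("def doit():\n    return 42\n")
import stale_demo
old_doit = stale_demo.doit               # reference taken before the re-import

source.write_text("def doit():\n    return 21  # edited version\n")
del sys.modules["stale_demo"]
importlib.invalidate_caches()
import stale_demo                        # rebinds the name to the new module

new_result = stale_demo.doit()           # looked up on the new module
old_result = old_doit()                  # still the function from the old module
print(new_result, old_result)
```

Anything that captured a function, class or instance earlier (for example via `from foo import doit`) keeps the old version until it is rebound explicitly.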

B: Change the name of your extensions with every version

Another strategy could be to build the module foo.pyx as foo_prefix1.so and then foo_prefix2.so and so on and load it as

>>> import foo_prefixX as foo

This is the strategy used by the %%cython magic in IPython, which uses the SHA-1 hash of the Cython code as a prefix.
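
Deriving such a name from the source could look like the sketch below. The helper `versioned_module_name` and the exact naming scheme are assumptions made for illustration; this does not reproduce IPython's actual implementation. Keep in mind that the extension has to be built under the generated name, since the init function is called PyInit_<module_name>.

```python
import hashlib


def versioned_module_name(base, source_code):
    """Return a module name that changes whenever the source changes."""
    digest = hashlib.sha1(source_code.encode("utf-8")).hexdigest()
    return f"{base}_{digest}"


name_v1 = versioned_module_name("foo", "def doit():\n    print(42)\n")
name_v2 = versioned_module_name("foo", "def doit():\n    print(21)\n")
print(name_v1)
print(name_v2)
```

Identical source always maps to the same name, so an unchanged module does not need to be rebuilt, while any edit yields a new name and therefore a new path for dlopen.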

One can emulate IPython's approach using imp.load_dynamic (or its implementation with help of importlib, as imp is deprecated):

from importlib._bootstrap import _load
def load_dynamic(name, path, file=None):
    """
    Load an extension module.
    """
    import importlib.machinery
    loader = importlib.machinery.ExtensionFileLoader(name, path)

    # Issue #24748: Skip the sys.modules check in _load_module_shim;
    # always load new extension
    spec = importlib.machinery.ModuleSpec(
        name=name, loader=loader, origin=path)
    return _load(spec)

Now, by putting the .so files into different folders (or adding some suffix) so that dlopen sees them as different from the previous version, we can use it:

# the first argument (name="foo") tells how the init function
# of the extension (i.e. `PyInit_<module_name>`) is named
foo = load_dynamic("foo", "1/foo.cpython-37m-x86_64-linux-gnu.so")
# now foo has new functionality:
foo = load_dynamic("foo", "2/foo.cpython-37m-x86_64-linux-gnu.so")
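
Placing each build into its own folder can be automated. The following is a sketch only: `copy_to_unique_path` is a hypothetical helper, and the demo copies a dummy file rather than a real extension so that it runs anywhere.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_unique_path(so_path):
    """Copy a built extension into a fresh directory and return the new path.

    The file name is kept as-is; only the directory differs, which is
    enough for the path to be different on every call.
    """
    src = Path(so_path)
    target_dir = Path(tempfile.mkdtemp(prefix="ext_build_"))
    return Path(shutil.copy2(src, target_dir / src.name))


# Demo with a dummy file standing in for the compiled extension.
dummy = Path(tempfile.mkdtemp()) / "foo.so"
dummy.write_bytes(b"not a real extension")

first_copy = copy_to_unique_path(dummy)
second_copy = copy_to_unique_path(dummy)
print(first_copy)
print(second_copy)
```

With a real build, the returned path would be passed on to load_dynamic after each rebuild, e.g. load_dynamic("foo", str(copy_to_unique_path(<path to the rebuilt .so>))).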

Even though reloading, and reloading of extensions in particular, is kind of hacky, for prototyping purposes I would probably go with the pyximport solution... or use IPython and the %%cython magic.

ead answered Nov 20 '22 13:11