Is there a way to load per-process copies of modules in processes created using Python's multiprocessing module? I tried this:
def my_fn(process_args):
    import my_module
    my_module.my_func()
...but the sub-imports in my_module get loaded and cached once and for all. In particular, one of the sub-imports reads a config file whose values get set based on the environment of the first process. If I try this:
def my_fn(process_args):
    try:
        my_module = reload(my_module)
    except NameError:
        import my_module
...the sub-imports of my_module do not get reloaded.
You could try implementing a deep reload function by inspecting the module to reload and reloading any modules it uses. This isn't foolproof; for example, it won't cope with something like:
class MyClass:
    module = import_module('amodule')
(the module stashed on the class is not a top-level attribute of the containing module, so the inspection below will never see it), but it could well be good enough for your purposes.
mymod.py
# Example submodule to re-import
print('import module mymod')

# demonstrate we can even import test as a module and it works
import sys
from test import deep_reload_module

value = 2


def a_function():
    pass


class XYZ:
    pass


class NewClass(object):
    pass
test.py
import importlib
import sys

import mymod


def deep_reload_module(name):
    mod = sys.modules.get(name)
    if not mod:
        importlib.import_module(name)
        return

    def get_mods_to_reload_recursively(name, modules_to_reload=None):
        modules_to_reload = modules_to_reload or set()
        mod = sys.modules[name]
        modules_to_reload.add(name)

        # loop through the attributes in this module and remember any
        # submodules we should also reload
        for attr in dir(mod):
            prop = getattr(mod, attr)
            if isinstance(prop, type(mymod)):
                modname = attr
            elif hasattr(prop, '__module__'):
                modname = prop.__module__
                if not modname:
                    continue
            else:
                # this thing is not a module nor does it come from another
                # module, so nothing to reimport.
                continue

            if modname in sys.builtin_module_names:
                # probably best not to reimport built-ins...
                continue

            if modname in modules_to_reload:
                # this is already marked for reimporting, so avoid infinite
                # recursion
                continue

            # get_mods_to_reload... will update modules_to_reload so no need
            # to catch the return value
            get_mods_to_reload_recursively(modname, modules_to_reload)

        return modules_to_reload

    mods_to_reload = get_mods_to_reload_recursively(name)

    for mtr in mods_to_reload:
        # best to delete everything before reloading so that you are
        # sure things get re-hooked up properly to the new modules.
        print('del sys.modules[%s]' % (mtr,))
        del sys.modules[mtr]

    importlib.import_module(name)


if __name__ == '__main__':
    orig_mymod_id = id(sys.modules['mymod'])
    deep_reload_module('mymod')
    assert orig_mymod_id != id(sys.modules['mymod'])
Then you just have to call deep_reload_module('module') whenever a new process starts or, even easier, at the beginning of each multiprocessing job.
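For instance, with the worker from the question it could look something like this (a sketch only: it assumes deep_reload_module has been saved into its own helper module, hypothetically called deep_reload here, and it reuses the question's placeholder names my_module and my_func):

import multiprocessing

# hypothetical helper module holding deep_reload_module() from above
from deep_reload import deep_reload_module


def my_fn(process_args):
    # throw away any cached copy of my_module and its sub-imports, then
    # import it again so its config file is re-read in this process
    deep_reload_module('my_module')
    import my_module
    return my_module.my_func()


if __name__ == '__main__':
    with multiprocessing.Pool(processes=2) as pool:
        results = pool.map(my_fn, range(4))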
NB: this relies quite heavily on code outside the module you want to reimport not having previously imported anything from that module, because if it has, that code will either keep using the old module or break.
E.g. if you've got code that does this:

    from module_to_reimport import a_function

but haven't retained module_to_reimport anywhere explicitly, then a_function may well fail when it gets called after the module is reimported, since it stays bound to the globals() of the old module_to_reimport, which is discarded when the module is deleted from sys.modules and replaced on re-import.
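If code like that does need to keep working across a re-import, one workaround (a sketch, not part of the answer above; it reuses the hypothetical deep_reload helper module and the placeholder names module_to_reimport and a_function) is to hold on to the module itself and re-run the import after reloading, so the name is re-bound to the fresh module object in sys.modules:

from deep_reload import deep_reload_module

import module_to_reimport           # keep a handle on the module, not just a_function

deep_reload_module('module_to_reimport')

import module_to_reimport           # re-binds the name to the new module object
module_to_reimport.a_function()     # looked up on the freshly imported module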