I have a function that uses multiprocessing (specifically joblib) to speed up a slow routine using multiple cores. It works great; no questions there.
I have a test suite that uses multiprocessing (currently just the multiprocessing.Pool() system, but I can change it to joblib) to run each module's test functions independently. It works great; no questions there.
The problem is that I've now integrated the parallelized function into the module's test suite, so that a pool worker ends up running a function that itself tries to spawn workers. I would like the inner function to detect that it is already running in a worker process and not spin up more forks of itself. Currently the inner process sometimes hangs, but even when it doesn't, there is obviously nothing to gain from multiprocessing within an already-parallel routine.
I can think of several ways (lock files, setting some sort of global variable, etc.) to determine which state we're in; one such ad-hoc check is sketched below. But I'm wondering if there is some standard way of figuring this out (either in Python's multiprocessing or in joblib). If it only works on Python 3, that'd be fine, though solutions that also work on 2.7 or lower would obviously be better. Thanks!
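For illustration, here is roughly the kind of check I mean (the function names are made up; the guard relies on the fact that multiprocessing.Pool workers are daemonic processes):

    import multiprocessing
    from joblib import Parallel, delayed

    def _step(x):
        return x * x

    def slow_routine(values, n_jobs=4):
        # Pool workers are daemonic, so this tells us we are already
        # inside a worker process and should not fork again.
        if multiprocessing.current_process().daemon:
            n_jobs = 1  # joblib runs sequentially when n_jobs == 1
        return Parallel(n_jobs=n_jobs)(delayed(_step)(v) for v in values)

This works, but it feels like something the libraries themselves should handle.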
joblib's Parallel should be able to sort this out by itself:
http://pydoc.net/Python/joblib/0.8.3-r1/joblib.parallel/
Two pieces from 0.8.3-r1:
    # Set an environment variable to avoid infinite loops
    os.environ[JOBLIB_SPAWNED_PROCESS] = '1'
I don't know why they switch between the module-level constant (JOBLIB_SPAWNED_PROCESS, which holds the environment variable's name) and the literal string itself, but as you can see, the feature is already implemented in joblib.
    # We can now allow subprocesses again
    os.environ.pop('__JOBLIB_SPAWNED_PARALLEL__', 0)
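If you want to piggyback on the same mechanism in your own code, you could check that environment variable yourself. A minimal sketch, with the caveat that __JOBLIB_SPAWNED_PARALLEL__ is an internal detail of joblib 0.8.x and could change in later versions:

    import os

    def inside_joblib_worker():
        # joblib 0.8.x sets this to '1' in its spawned workers
        # (internal detail, not a public API).
        return os.environ.get('__JOBLIB_SPAWNED_PARALLEL__') == '1'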
Here you can select other versions, if that's more relevant:
http://pydoc.net/Python/joblib/0.8.3-r1/