Spawn multiprocessing.Process under different python executable with own path

I have two versions of Python (these are actually two conda environments)

/path/to/bin-1/python
/path/to/bin-2/python

From one version of Python I want to launch a function that runs under the other version, using something like a multiprocessing.Process object. It turns out that this is doable via the context's set_executable method:

import multiprocessing

ctx = multiprocessing.get_context('spawn')
ctx.set_executable('/path/to/bin-2/python')

And indeed we can see that this does in fact launch using that executable:

def f(q):
    import sys
    q.put(sys.executable)  # report which interpreter actually ran this child

if __name__ == '__main__':
    import multiprocessing
    ctx = multiprocessing.get_context('spawn')
    ctx.set_executable('/path/to/bin-2/python')
    q = ctx.Queue()
    proc = ctx.Process(target=f, args=(q,))
    proc.start()
    print(q.get())

$ python foo.py
/path/to/bin-2/python

However, sys.path is Wrong

However, when I do the same thing with sys.path rather than sys.executable, I find that the sys.path of the hosting Python process is printed, not the sys.path I would get from running /path/to/bin-2/python -c "import sys; print(sys.path)" directly.
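
For example, putting sys.path on the queue and comparing it with what /path/to/bin-2/python reports when started from the shell makes the mismatch visible. A minimal sketch of that check (same placeholder paths as above):

import subprocess

def f(q):
    import sys
    q.put(sys.path)

if __name__ == '__main__':
    import multiprocessing
    ctx = multiprocessing.get_context('spawn')
    ctx.set_executable('/path/to/bin-2/python')
    q = ctx.Queue()
    proc = ctx.Process(target=f, args=(q,))
    proc.start()
    spawned_path = q.get()
    proc.join()

    # Reference value: what bin-2 builds when launched directly.
    direct_path = subprocess.check_output(
        ['/path/to/bin-2/python', '-c', 'import sys; print(sys.path)'])

    print(spawned_path)          # matches the hosting interpreter's sys.path
    print(direct_path.decode())  # includes bin-2's own site-packages instead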

I'm used to this sort of behaviour with fork, but I would have expected 'spawn' to act the same as if I had started the Python interpreter from the shell.

Question

Is it possible to use the multiprocessing library to run functions and use Queues from another Python executable with the environment that it would have had had I started it from the shell?

More broadly, how does sys.path get populated and what is different between using multiprocessing in this way and launching the interpreter directly?
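
Digging around a little (and this is only my reading of CPython's internals, so treat it as an assumption): the spawn start method seems to hand the parent's sys.path to the child as part of its "preparation data", which would explain why the child never builds the path that bin-2 would build on its own. A minimal way to peek at that data from the hosting interpreter:

from multiprocessing import spawn

if __name__ == '__main__':
    # get_preparation_data is an internal CPython helper, not a public API;
    # shown only to illustrate what the parent forwards to a spawned child.
    data = spawn.get_preparation_data('demo')
    print(data['sys_path'])  # the hosting interpreter's sys.path, sent to the child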

asked Sep 07 '16 by MRocklin

1 Answer

I ran into the same problem. My system-wide Python executable is at /path/to/bin-1/python, and I created a virtual environment with virtualenv containing another Python executable at /path/to/bin-2/python. To set up the right path and environment for the process spawned with /path/to/bin-2/python, I ended up copying the code from the virtualenv's activate_this.py into f(q).

def f(q):
    import sys, os

    def active_virtualenv(exec_path):
        """
        copy virtualenv's activate_this.py
        exec_path: the python.exe path from sys.executable
        """
        # set env. var. PATH
        old_os_path = os.environ.get('PATH', '')
        os.environ['PATH'] = os.path.dirname(os.path.abspath(exec_path)) + os.pathsep + old_os_path
        base = os.path.dirname(os.path.dirname(os.path.abspath(exec_path)))
        # site-packages path
        if sys.platform == 'win32':
            site_packages = os.path.join(base, 'Lib', 'site-packages')
        else:
            site_packages = os.path.join(base, 'lib', 'python%d.%d' % sys.version_info[:2], 'site-packages')
        # modify sys.path
        prev_sys_path = list(sys.path)
        import site
        site.addsitedir(site_packages)
        sys.real_prefix = sys.prefix
        sys.prefix = base
        # Move the added items to the front of the path:
        new_sys_path = []
        for item in list(sys.path):
            if item not in prev_sys_path:
                new_sys_path.append(item)
                sys.path.remove(item)
        sys.path[:0] = new_sys_path
        return None

    active_virtualenv(sys.executable)
    q.put(sys.executable)
    # check some unique package in this env.
    import special_package
    print "package version: {}".format(special_package.__version__)


if __name__ == '__main__':
    import multiprocessing
    # set_executable only takes effect with the 'spawn' start method
    # (the default on Windows; on Unix it has to be selected explicitly)
    multiprocessing.set_start_method('spawn')
    multiprocessing.set_executable('/path/to/bin-2/python')
    q = multiprocessing.Queue()
    proc = multiprocessing.Process(target=f, args=(q,))
    proc.start()
    print(q.get())  # drain the queue before joining to avoid a potential deadlock
    proc.join()

stdout:

$ python foo.py
/path/to/bin-2/python
package version: unique_version_only_in_virtualenv

One thing I'm not so certain about: sys and os are imported before active_virtualenv() runs, which means they come from the system-wide Python environment, while the other packages I need in f(q) are imported afterwards and therefore come from the virtual environment. It may be worth re-importing anything that was loaded before switching environments.
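
If that ever becomes a problem, one option (untested here, just a sketch for Python 3) is to re-resolve such a module after the path switch with importlib.reload, which re-runs the import finders against the updated sys.path. special_package below is only a stand-in for something that had already been imported before activating:

import importlib
import special_package  # imagine this was imported before active_virtualenv()

# ... after active_virtualenv(sys.executable) has prepended the
# virtualenv's site-packages to sys.path ...
special_package = importlib.reload(special_package)
print(special_package.__file__)  # should now resolve inside the virtualenv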

answered Oct 23 '22 by graffaner