IPC with a Python subprocess

I'm trying to do some simple IPC in Python as follows: One Python process launches another with subprocess. The child process sends some data into a pipe and the parent process receives it.

Here's my current implementation:

# parent.py
import pickle
import os
import subprocess
import sys
read_fd, write_fd = os.pipe()
if hasattr(os, 'set_inheritable'):
    os.set_inheritable(write_fd, True)
child = subprocess.Popen((sys.executable, 'child.py', str(write_fd)), close_fds=False)
try:
    with os.fdopen(read_fd, 'rb') as reader:
        data = pickle.load(reader)
finally:
    child.wait()
assert data == 'This is the data.'

# child.py
import pickle
import os
import sys
with os.fdopen(int(sys.argv[1]), 'wb') as writer:
    pickle.dump('This is the data.', writer)

On Unix this works as expected, but if I run this code on Windows, I get the following error, after which the program hangs until interrupted:

Traceback (most recent call last):
  File "child.py", line 4, in <module>
    with os.fdopen(int(sys.argv[1]), 'wb') as writer:
  File "C:\Python34\lib\os.py", line 978, in fdopen
    return io.open(fd, *args, **kwargs)
OSError: [Errno 9] Bad file descriptor

I suspect the problem is that the child process isn't inheriting the write_fd file descriptor. How can I fix this?

The code needs to be compatible with Python 2.7, 3.2, and all subsequent versions. This means that the solution can't depend on either the presence or the absence of the changes to file descriptor inheritance specified in PEP 446. As implied above, it also needs to run on both Unix and Windows.

(To answer a couple of obvious questions: The reason I'm not using multiprocessing is because, in my real-life non-simplified code, the two Python programs are part of Django projects with different settings modules. This means they can't share any global state. Also, the child process's standard streams are being used for other purposes and are not available for this.)

UPDATE: After setting the close_fds parameter, the code now works in all versions of Python on Unix. However, it still fails on Windows.

asked Feb 15 '15 by Taymon


2 Answers

subprocess.PIPE is implemented for all platforms. Why don't you just use this?

If you want to manually create and use an os.pipe(), you need to take care of the fact that Windows does not support fork(). Instead it uses CreateProcess(), which by default does not make the child inherit open files. But there is a way: each file descriptor can be made explicitly inheritable. This requires calling the Win32 API. I have implemented this in gipc; see the _pre/post_createprocess_windows() methods here.
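A minimal sketch of that idea follows. This is an illustration of the general mechanism, not gipc's exact code: on Windows the integer fd is a C-runtime index that means nothing in another process, so it is the underlying OS HANDLE that must be made inheritable (via SetHandleInformation) and passed to the child.

```python
import os
import sys

def make_inheritable(fd):
    """Return a token the child can use to reach this pipe end.

    On Windows, flip HANDLE_FLAG_INHERIT on the OS-level HANDLE behind
    the fd and return the handle number; the parent should pass that
    handle number (not the fd) on the child's command line.
    On Unix, the fd itself is the token; Python 3.4+ (PEP 446) makes
    fds non-inheritable by default, so undo that where possible.
    """
    if sys.platform == 'win32':
        import ctypes
        import msvcrt
        HANDLE_FLAG_INHERIT = 0x00000001
        handle = msvcrt.get_osfhandle(fd)
        ctypes.windll.kernel32.SetHandleInformation(
            handle, HANDLE_FLAG_INHERIT, HANDLE_FLAG_INHERIT)
        return handle
    if hasattr(os, 'set_inheritable'):  # Python 3.4+ (PEP 446)
        os.set_inheritable(fd, True)
    return fd
```

On the Windows side the child then has to convert the inherited handle back into a C-runtime fd with `msvcrt.open_osfhandle(int(sys.argv[1]), 0)` before `os.fdopen()` will accept it.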

answered Sep 22 '22 by Dr. Jan-Philip Gehrcke


As @Jan-Philip Gehrcke suggested, you could use subprocess.PIPE instead of os.pipe():

#!/usr/bin/env python
# parent.py
import sys
from subprocess import check_output

data = check_output([sys.executable or 'python', 'child.py'])
assert data.decode().strip() == 'This is the data.'

check_output() uses stdout=subprocess.PIPE internally.

You could use obj = pickle.loads(data) if child.py uses data = pickle.dumps(obj).
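That pickle round trip over stdout can be sketched as follows. The child is inlined with `-c` here purely to keep the example self-contained; in the answer's setup it would live in child.py. Note `sys.stdout.buffer` is the Python 3 spelling (on Python 2, sys.stdout is already a byte stream).

```python
import pickle
import sys
from subprocess import check_output

# The child pickles an object and writes the bytes to its stdout.
child_code = """
import pickle, sys
payload = {'msg': 'This is the data.', 'n': 42}
sys.stdout.buffer.write(pickle.dumps(payload))
"""

# check_output() captures the child's stdout via subprocess.PIPE.
data = check_output([sys.executable, '-c', child_code])
obj = pickle.loads(data)
```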

And the child.py could be simplified:

#!/usr/bin/env python
# child.py
print('This is the data.')

If the child process is written in Python then, for greater flexibility, you could import the child script as a module and call its functions instead of using subprocess. If you need to run some Python code in a different process, you could use the multiprocessing or concurrent.futures modules.

If you can't use the standard streams then your Django applications could use sockets to talk to one another.
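A self-contained sketch of the socket approach (the payload and the inlined child are made up for illustration): the parent listens on an ephemeral localhost port, hands the port number to the child on its command line, and the child connects back and sends pickled data. No file descriptors or standard streams are shared, so this behaves the same on Unix and Windows.

```python
import pickle
import socket
import subprocess
import sys

# Parent: listen on a free localhost port chosen by the OS.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))   # port 0 = let the OS pick one
server.listen(1)
port = server.getsockname()[1]

# Child: connect back to the parent's port and send pickled data.
child_code = """
import pickle, socket, sys
s = socket.create_connection(('127.0.0.1', int(sys.argv[1])))
s.sendall(pickle.dumps('This is the data.'))
s.close()
"""
child = subprocess.Popen([sys.executable, '-c', child_code, str(port)])

# Accept the connection and read until the child closes its end (EOF).
conn, _ = server.accept()
chunks = []
while True:
    chunk = conn.recv(4096)
    if not chunk:
        break
    chunks.append(chunk)
conn.close()
server.close()
child.wait()

data = pickle.loads(b''.join(chunks))
```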

The reason I'm not using multiprocessing is because, in my real-life non-simplified code, the two Python programs are part of Django projects with different settings modules. This means they can't share any global state.

This seems bogus. multiprocessing may itself use the subprocess module under the hood. If you don't want to share global state, then don't share it; that is the default for separate processes. You should probably ask a more specific question about how to organize the communication between the various parts of your project.

answered Sep 22 '22 by jfs