
Python multiprocessing and independence of children processes

From the python terminal, I run some command like the following, to spawn a long-running child process:

from multiprocessing import Process
Process(target=LONG_RUNNING_FUNCTION).start()

This command returns, and I can do other things in the python terminal, but anything printed by the child is still printed to my python terminal session.

When I exit the terminal (either with exit or CTRL+D), the exit hangs. If I hit CTRL+C during this hang, the child process is terminated.

If I kill the python terminal process manually (via the posix kill command), the child process is instead orphaned, and continues running with its output presumably discarded.

If I run this code with python -c, it waits for the child to terminate, and CTRL+C kills both parent and child.

Which run configurations of python kill children when the parents are terminated? In particular, if a python-mod_wsgi-apache webserver spawns child processes and then is restarted, are the children killed?

[ As an aside, what is the proper way to detach child processes spawned from the terminal? Is there anything more elegant than the approach in: Deliberately make an orphan process in python ]

Update: python subprocesses spawned with multiprocessing.Process by a web server running under apache are not killed when apache is restarted.

Zags asked Feb 09 '14 21:02

1 Answer

This isn't a matter of how you're invoking python; it's a feature of the multiprocessing module. Importing that module registers an exit handler in the parent process that calls join() on every live, non-daemon child created via multiprocessing.Process before the parent is allowed to exit. If you're going to start child processes in this fashion, there's no way, short of hacking the module internals, to avoid the behavior that's giving you trouble.
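A small sketch of the distinction (using the fork start method explicitly, so it is POSIX-only): the non-daemon child below is what the exit handler would join, while a daemon child is never joined at exit; it is terminated instead, so it cannot outlive the parent either way:

```python
import multiprocessing
import time

def worker():
    time.sleep(0.2)

ctx = multiprocessing.get_context("fork")  # fork start method, POSIX only

# A non-daemon child: multiprocessing's exit handler would join() this
# process before letting the interpreter exit -- the hang you observed.
p = ctx.Process(target=worker)
p.start()
p.join()            # joining explicitly, as the exit handler would
print(p.exitcode)   # 0

# A daemon child is not joined at interpreter exit; it is terminated.
d = ctx.Process(target=worker, daemon=True)
d.start()
print(d.daemon)     # True
d.terminate()
d.join()
```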

If you want to start a process that can outlive its parent, you'll probably be better served by subprocess.Popen. A child started that way is never joined at exit: the parent exits immediately, leaving an orphaned child behind (start_new_session=True additionally puts the child in its own session, so signals sent to the terminal's process group won't reach it):

>>> from subprocess import Popen
>>> Popen(["sleep", "100"], start_new_session=True)
<subprocess.Popen object at 0x10d3fedd0>
>>> exit()
alp:~ $ ps -opid,ppid,command | grep sleep | grep -v grep
37979     1 sleep 100
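Since the question also mentions the child's output landing in the terminal session, note that you can discard (or redirect) the orphan's streams at spawn time. A sketch, assuming a POSIX system with a `sleep` binary on the PATH:

```python
import subprocess

# Start a detached child whose output is discarded, so nothing is
# printed to the spawning terminal session.
child = subprocess.Popen(
    ["sleep", "100"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
    start_new_session=True,  # own session: terminal signals won't reach it
)
print(child.pid)
```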

Is there a particular reason you're using multiprocessing instead of subprocess? The former isn't intended to be used to create child processes meant to outlive the parent; it's for creating child processes to do work that can be usefully parallelized across CPUs, as a way of circumventing the Global Interpreter Lock. (I'm ignoring the distributed capabilities of multiprocessing for purposes of this discussion.) multiprocessing is thus normally used in those cases where, if there weren't a GIL, you would use threads. (Note, in this regard, that the multiprocessing module's API is modeled closely after that of the threading module.)
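For contrast, the kind of job multiprocessing is designed for looks like this: fan CPU-bound work out across worker processes and collect the results before the parent exits (a minimal sketch, again pinning the fork start method so it can run at module level on POSIX):

```python
from multiprocessing import get_context

def square(n):
    return n * n

# Distribute the work across 4 worker processes; map gathers the
# results back in order, and the pool is joined before we move on.
ctx = get_context("fork")
with ctx.Pool(4) as pool:
    results = pool.map(square, range(5))
print(results)  # [0, 1, 4, 9, 16]
```

Here the children are short-lived helpers of the parent, which is exactly the lifecycle the module's exit handler assumes.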

To the specific questions at the end of your post: (1) Nothing about python is responsible for killing children when the parent is terminated. A child of the web server will only be killed if the parent kills it before exiting (or if the entire process group is killed). (2) The method you link to looks like it's trying to replicate daemonization without knowledge of the standard idioms for doing that. There are a number of packages out there for creating daemon processes; you should use one of them instead.
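The standard idiom those packages implement is the classic Unix double fork. A condensed sketch of the core of it, not a substitute for a real daemon library (which also handles pid files, umask, and signal setup):

```python
import os
import sys

def daemonize():
    """Detach the current process from its controlling terminal (POSIX)."""
    if os.fork() > 0:      # first fork: the original parent returns to the shell
        os._exit(0)
    os.setsid()            # become leader of a new session, with no terminal
    if os.fork() > 0:      # second fork: can never reacquire a terminal
        os._exit(0)
    os.chdir("/")          # don't keep any mount point busy
    sys.stdout.flush()
    sys.stderr.flush()
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):   # redirect stdin/stdout/stderr to /dev/null
        os.dup2(devnull, fd)
```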

Alp answered Oct 11 '22 18:10