I'm getting the following error when using the multiprocessing module inside a Python daemon process (created with python-daemon):
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/local/lib/python2.6/multiprocessing/util.py", line 262, in _exit_function
    for p in active_children():
  File "/usr/local/lib/python2.6/multiprocessing/process.py", line 43, in active_children
    _cleanup()
  File "/usr/local/lib/python2.6/multiprocessing/process.py", line 53, in _cleanup
    if p._popen.poll() is not None:
  File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 106, in poll
    pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
The daemon process (the parent) spawns a number of worker processes (the children) and then periodically polls them to see whether they have completed. When the parent detects that one of the children has finished, it attempts to restart it, and that is the point at which the exception above is raised. Once one child completes, it seems that any operation involving the multiprocessing module raises this exception. If I run the identical code as a regular (non-daemon) Python script, it executes with no errors whatsoever.
EDIT:
Sample script
from daemon import runner

class DaemonApp(object):
    def __init__(self, pidfile_path, run):
        self.pidfile_path = pidfile_path
        self.run = run
        # attributes required by runner.DaemonRunner
        self.stdin_path = '/dev/null'
        self.stdout_path = '/dev/tty'
        self.stderr_path = '/dev/tty'

def run():
    import multiprocessing as processing
    import time
    import os
    import sys
    import signal

    def func():
        print 'pid: ', os.getpid()
        for i in range(5):
            print i
            time.sleep(1)

    process = processing.Process(target=func)
    process.start()

    while True:
        print 'checking process'
        if not process.is_alive():
            print 'process dead'
            process = processing.Process(target=func)
            process.start()
        time.sleep(1)

# uncomment to run as daemon
app = DaemonApp('/root/bugtest.pid', run)
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()

# uncomment to run as regular script
#run()
Your problem is a conflict between the daemon and multiprocessing modules, specifically in daemon's handling of the SIGCLD (child process terminated) signal. daemon sets SIGCLD to SIG_IGN when launching, which, at least on Linux, causes terminated children to be reaped immediately by the kernel (rather than becoming zombies until the parent invokes wait()). But multiprocessing's is_alive() test invokes os.waitpid() to see whether the process is still alive, and that call fails if the child has already been reaped.
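You can reproduce the effect outside daemon entirely. Here is a minimal standalone sketch (not taken from the question's script) that sets SIGCHLD to SIG_IGN by hand and then hits the same ECHILD error; signal.SIGCLD is just the Linux alias for signal.SIGCHLD:

import os
import signal
import time

# With SIGCHLD ignored, the kernel reaps the child automatically,
# so a later os.waitpid() has no child to collect and fails with
# ECHILD ("No child processes") -- the same error the daemon hits.
signal.signal(signal.SIGCHLD, signal.SIG_IGN)

pid = os.fork()
if pid == 0:
    os._exit(0)      # child: exit immediately
time.sleep(1)        # parent: give the kernel time to auto-reap the child
try:
    os.waitpid(pid, 0)
except OSError, e:
    print 'waitpid failed:', e   # [Errno 10] No child processes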
The simplest solution is to set SIGCLD back to SIG_DFL (the default behaviour: the signal is discarded, but terminated children stick around as zombies until the parent wait()s for them):
def run():
    # ...
    signal.signal(signal.SIGCLD, signal.SIG_DFL)
    process = processing.Process(target=func)
    process.start()

    while True:
        # ...
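One portability note: SIGCLD is the old System V name for the signal and is not defined on every platform; signal.SIGCHLD is the portable spelling. If the script might run somewhere SIGCLD is missing, a hedged variant is:

# signal.SIGCLD is a legacy alias that not every platform defines;
# fall back to the portable signal.SIGCHLD name.
sigchld = getattr(signal, 'SIGCLD', signal.SIGCHLD)
signal.signal(sigchld, signal.SIG_DFL)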