I have a Python program (specifically, a Django application) that starts a subprocess using subprocess.Popen. Due to architectural constraints of my application, I can't use Popen.terminate() to terminate the subprocess or Popen.poll() to check when it has terminated, because I cannot hold a reference to the started subprocess in a variable. Instead, I have to write the process id pid to a file pidfile when the subprocess starts. When I want to stop the subprocess, I open this pidfile and use os.kill(pid, signal.SIGTERM) to stop it.
My question is: how can I find out when the subprocess has really terminated? With signal.SIGTERM it takes approximately 1-2 minutes to finally terminate after os.kill() is called. At first I thought os.waitpid() would be the right tool for this task, but when I call it after os.kill() it gives me OSError: [Errno 10] No child processes.
By the way, I'm starting and stopping the subprocess from an HTML template using two forms, and the program logic lives in a Django view. The exception gets displayed in my browser when my application is in debug mode. It's probably also important to know that the subprocess I call in my view (python manage.py crawlwebpages) itself starts another subprocess, namely an instance of a Scrapy crawler. I write the pid of this Scrapy instance to the pidfile, and this is the process I want to terminate.
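For completeness, the pidfile is written from inside the started process along these lines (the filename is the one my view reads; the rest is a simplified sketch, since that code isn't shown here):

```python
import os

# Inside the process whose pid should be recorded (the Scrapy crawler in my
# case), write the pid out when it starts:
with open('scrapy_crawler_process.pid', 'w') as pidfile:
    pidfile.write(str(os.getpid()))
```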
Here is the relevant code:
def process_main_page_forms(request):
    if request.method == 'POST':
        if request.POST['form-type'] == u'webpage-crawler-form':
            template_context = _crawl_webpage(request)
        elif request.POST['form-type'] == u'stop-crawler-form':
            template_context = _stop_crawler(request)
    else:
        template_context = {
            'webpage_crawler_form': WebPageCrawlerForm(),
            'stop_crawler_form': StopCrawlerForm()}
    return render(request, 'main.html', template_context)
def _crawl_webpage(request):
    webpage_crawler_form = WebPageCrawlerForm(request.POST)
    if webpage_crawler_form.is_valid():
        url_to_crawl = webpage_crawler_form.cleaned_data['url_to_crawl']
        maximum_pages_to_crawl = webpage_crawler_form.cleaned_data['maximum_pages_to_crawl']
        program = 'python manage.py crawlwebpages' + ' -n ' + str(maximum_pages_to_crawl) + ' ' + url_to_crawl
        p = subprocess.Popen(program.split())
    template_context = {
        'webpage_crawler_form': webpage_crawler_form,
        'stop_crawler_form': StopCrawlerForm()}
    return template_context
def _stop_crawler(request):
    stop_crawler_form = StopCrawlerForm(request.POST)
    if stop_crawler_form.is_valid():
        with open('scrapy_crawler_process.pid', 'rb') as pidfile:
            process_id = int(pidfile.read().strip())
            print 'PROCESS ID:', process_id
        os.kill(process_id, signal.SIGTERM)
        os.waitpid(process_id, os.WNOHANG)  # This gives me the OSError
        print 'Crawler process terminated!'
    template_context = {
        'webpage_crawler_form': WebPageCrawlerForm(),
        'stop_crawler_form': stop_crawler_form}
    return template_context
What can I do? Thank you very much!
EDIT:
According to the great answer given by Jacek Konieczny, I could solve my problem by changing my code in the function _stop_crawler(request)
to the following:
def _stop_crawler(request):
    stop_crawler_form = StopCrawlerForm(request.POST)
    if stop_crawler_form.is_valid():
        with open('scrapy_crawler_process.pid', 'rb') as pidfile:
            process_id = int(pidfile.read().strip())
        # These are the essential lines
        os.kill(process_id, signal.SIGTERM)
        while True:
            try:
                time.sleep(10)
                os.kill(process_id, 0)
            except OSError:
                break
        print 'Crawler process terminated!'
    template_context = {
        'webpage_crawler_form': WebPageCrawlerForm(),
        'stop_crawler_form': stop_crawler_form}
    return template_context
The usual way to check whether a process is still running is to kill() it with signal 0. This does nothing to a running process, but raises an OSError exception with errno=ESRCH if the process does not exist.
[jajcus@lolek ~]$ sleep 1000 &
[1] 2405
[jajcus@lolek ~]$ python
Python 2.7.3 (default, May 11 2012, 11:57:22)
[GCC 4.6.3 20120315 (release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.kill(2405, 0)
>>> os.kill(2405, 15)
>>> os.kill(2405, 0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 3] No such process
But whenever possible the caller should remain the parent of the called process and use the wait() family of functions to handle its termination; that is what the Popen object does. This is also why your os.waitpid() call fails with errno 10 (ECHILD): the Scrapy process is a child of crawlwebpages, not of your Django process, so your process is not allowed to wait for it.
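When you can keep the Popen object around, that pattern is just a few lines; a minimal sketch (note that wait() with a timeout needs Python 3.3+, while your code is Python 2):

```python
import subprocess

# Start a long-running child; 'sleep 100' stands in for the real program.
p = subprocess.Popen(['sleep', '100'])

p.terminate()                        # sends SIGTERM
try:
    returncode = p.wait(timeout=10)  # the parent reaps its child; no ECHILD here
except subprocess.TimeoutExpired:    # the child ignored SIGTERM
    p.kill()                         # escalate to SIGKILL
    returncode = p.wait()
```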
My solution would be to put an intermediate process in charge of controlling the subprocesses. Your web requests (which all seem to happen in different processes - due to parallelization?) would tell this control process to launch a given program and watch it; whenever needed, they ask it what the status is.
In the simplest case this would be a process that opens a UNIX domain socket (a TCP/IP socket would work as well) and listens on it. The "web process" connects to it, sends a launch request and gets back a unique ID. It can then use this ID to make further queries about the new process.
Alternatively, the web process could choose the ID itself (or no ID would be needed at all, if there can only be one managed process), so it doesn't have to keep the returned ID around.
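A minimal sketch of such a control process, assuming a line-based text protocol over a UNIX domain socket; the socket path and the START/STATUS/STOP commands are illustrative, not an existing API:

```python
import os
import socket
import subprocess
import threading
import uuid

SOCKET_PATH = '/tmp/crawler_control.sock'

class ControlServer:
    """Owns the Popen objects, so it can poll() and wait() on them."""

    def __init__(self, path=SOCKET_PATH):
        self.procs = {}                   # job id -> Popen object
        if os.path.exists(path):
            os.unlink(path)
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.bind(path)
        self.sock.listen(5)

    def handle(self, conn):
        words = conn.makefile('r').readline().split()
        if words[0] == 'START':           # START <argv...> -> job id
            job_id = uuid.uuid4().hex[:8]
            self.procs[job_id] = subprocess.Popen(words[1:])
            conn.sendall((job_id + '\n').encode())
        elif words[0] == 'STATUS':        # STATUS <id> -> running / exited
            status = 'running' if self.procs[words[1]].poll() is None else 'exited'
            conn.sendall((status + '\n').encode())
        elif words[0] == 'STOP':          # STOP <id>: SIGTERM, then reap
            p = self.procs[words[1]]
            p.terminate()
            p.wait()                      # the parent reaps; no ECHILD here
            conn.sendall(b'stopped\n')
        conn.close()

    def serve_forever(self):
        while True:
            conn, _ = self.sock.accept()
            threading.Thread(target=self.handle, args=(conn,)).start()

def send_command(*words, path=SOCKET_PATH):
    """Client helper for the 'web process' side."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(path)
    s.sendall((' '.join(words) + '\n').encode())
    reply = s.makefile('r').readline().strip()
    s.close()
    return reply
```

The Django view would then only ever call send_command(), never Popen or os.kill() directly, and the ECHILD problem disappears because the control process is the real parent.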