How to find out when subprocess has terminated after using os.kill()?

I have a Python program (more precisely, a Django application) that starts a subprocess using subprocess.Popen. Due to architectural constraints of my application, I'm not able to use Popen.terminate() to terminate the subprocess or Popen.poll() to check when the process has terminated, because I cannot hold a reference to the started subprocess in a variable.

Instead, I have to write the process id pid to a file pidfile when the subprocess starts. When I want to stop the subprocess, I open this pidfile and use os.kill(pid, signal.SIGTERM) to stop it.

My question is: how can I find out when the subprocess has really terminated? With signal.SIGTERM, the subprocess needs approximately 1-2 minutes to finally terminate after os.kill() has been called. At first I thought that os.waitpid() would be the right tool for this task, but when I call it after os.kill() it gives me OSError: [Errno 10] No child processes.

By the way, I'm starting and stopping the subprocess from an HTML template using two forms, and the program logic lives in a Django view. The exception gets displayed in my browser when my application is in debug mode. It's probably also important to know that the subprocess I call in my view (python manage.py crawlwebpages) itself starts another subprocess, namely an instance of a Scrapy crawler. It is the pid of this Scrapy instance that I write to the pidfile, and this is the process I want to terminate.
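
The write side of this pidfile protocol is not shown below; it boils down to something like the following sketch (the filename matches the code below, but the helper itself is only an illustration, not the actual crawlwebpages code):

import os

def write_pidfile(path='scrapy_crawler_process.pid'):
    # Called from within the Scrapy crawler process, so os.getpid()
    # is the pid that _stop_crawler() below reads back and signals.
    with open(path, 'w') as pidfile:
        pidfile.write(str(os.getpid()))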

Here is the relevant code:

import os
import signal
import subprocess
# (imports of render and the two form classes are not shown here)

def process_main_page_forms(request):
    if request.method == 'POST':
        if request.POST['form-type'] == u'webpage-crawler-form':
            template_context = _crawl_webpage(request)

        elif request.POST['form-type'] == u'stop-crawler-form':
            template_context = _stop_crawler(request)
    else:
        template_context = {
            'webpage_crawler_form': WebPageCrawlerForm(),
            'stop_crawler_form': StopCrawlerForm()}

    return render(request, 'main.html', template_context)

def _crawl_webpage(request):
    webpage_crawler_form = WebPageCrawlerForm(request.POST)

    if webpage_crawler_form.is_valid():
        url_to_crawl = webpage_crawler_form.cleaned_data['url_to_crawl']
        maximum_pages_to_crawl = webpage_crawler_form.cleaned_data['maximum_pages_to_crawl']

        program = 'python manage.py crawlwebpages' + ' -n ' + str(maximum_pages_to_crawl) + ' ' + url_to_crawl
        p = subprocess.Popen(program.split())

    template_context = {
        'webpage_crawler_form': webpage_crawler_form,
        'stop_crawler_form': StopCrawlerForm()}

    return template_context

def _stop_crawler(request):
    stop_crawler_form = StopCrawlerForm(request.POST)

    if stop_crawler_form.is_valid():
        with open('scrapy_crawler_process.pid', 'rb') as pidfile:
            process_id = int(pidfile.read().strip())
            print 'PROCESS ID:', process_id

        os.kill(process_id, signal.SIGTERM)
        os.waitpid(process_id, os.WNOHANG) # This gives me the OSError
        print 'Crawler process terminated!'

    template_context = {
        'webpage_crawler_form': WebPageCrawlerForm(),
        'stop_crawler_form': stop_crawler_form}

    return template_context

What can I do? Thank you very much!

EDIT:

Thanks to the great answer given by Jacek Konieczny, I could solve my problem by changing the code of my function _stop_crawler(request) to the following:

import time  # needed for time.sleep() below

def _stop_crawler(request):
    stop_crawler_form = StopCrawlerForm(request.POST)

    if stop_crawler_form.is_valid():
        with open('scrapy_crawler_process.pid', 'rb') as pidfile:
            process_id = int(pidfile.read().strip())

        # These are the essential lines
        os.kill(process_id, signal.SIGTERM)
        while True:
            try:
                time.sleep(10)
                os.kill(process_id, 0)  # signal 0 only checks that the process exists
            except OSError:
                break  # the process is gone
        print 'Crawler process terminated!'

    template_context = {
        'webpage_crawler_form': WebPageCrawlerForm(),
        'stop_crawler_form': stop_crawler_form}

    return template_context
asked Nov 15 '12 by pemistahl


2 Answers

The usual way to check whether a process is still running is to kill() it with signal 0. It does nothing to a running process and raises an OSError exception with errno=ESRCH if the process does not exist.

[jajcus@lolek ~]$ sleep 1000 &
[1] 2405
[jajcus@lolek ~]$ python
Python 2.7.3 (default, May 11 2012, 11:57:22) 
[GCC 4.6.3 20120315 (release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.kill(2405, 0)
>>> os.kill(2405, 15)
>>> os.kill(2405, 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 3] No such process
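
Translated to Python, the check looks roughly like this (a minimal sketch; note that on a multi-user system kill() can also fail with EPERM, which means the process exists but belongs to another user):

import errno
import os

def pid_exists(pid):
    try:
        os.kill(pid, 0)      # signal 0: error checking only, nothing is delivered
    except OSError as e:
        if e.errno == errno.ESRCH:
            return False     # no such process
        if e.errno == errno.EPERM:
            return True      # the process exists, but we may not signal it
        raise
    return True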

But whenever possible the caller should remain the parent of the called process and use the wait() family of functions to handle its termination. That is what the Popen object does.
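
For completeness, the parent-side variant looks roughly like this (a minimal sketch; it assumes you can keep the Popen object around, which the question explicitly rules out):

import subprocess

p = subprocess.Popen(['sleep', '1000'])
p.terminate()            # sends SIGTERM to the child
if p.poll() is None:     # poll() returns None while the child is still running
    p.wait()             # blocks until the child exits and reaps it
print('terminated with return code %s' % p.returncode)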

answered by Jacek Konieczny


My solution would be to put an intermediate process in charge of controlling the subprocesses.

Your web requests (which all seem to happen in different processes, due to parallelization?) then tell this control process to launch a given program and watch it; whenever needed, they ask it what the status is.

In the simplest case, this would be a process which opens a UNIX domain socket (a TCP/IP socket would do as well) and listens on it. The "web process" connects to it, sends a launch request and gets back a unique ID. Afterwards, it can use this ID to make further queries about the new process.

Alternatively, the web process can choose the ID itself (or use no ID at all, if there can only be one process), so that it doesn't have to keep a variable ID around.
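
A minimal sketch of such a control process might look like this; the socket path, the line-based START/STATUS protocol and all names are assumptions for illustration, not part of the answer:

import os
import socket
import subprocess

SOCKET_PATH = '/tmp/crawler_control.sock'   # assumed location

def serve():
    jobs = {}        # job id -> Popen object
    next_id = 1

    if os.path.exists(SOCKET_PATH):
        os.unlink(SOCKET_PATH)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCKET_PATH)
    server.listen(1)

    while True:
        conn, _ = server.accept()
        request = conn.recv(4096).decode().strip()
        if request.startswith('START '):
            # The control process launches the program itself and therefore
            # stays its parent, so poll()/wait() work as intended.
            jobs[next_id] = subprocess.Popen(request[6:].split())
            conn.sendall(('ID %d\n' % next_id).encode())
            next_id += 1
        elif request.startswith('STATUS '):
            job = jobs.get(int(request[7:]))
            if job is None:
                conn.sendall(b'UNKNOWN\n')
            elif job.poll() is None:
                conn.sendall(b'RUNNING\n')
            else:
                conn.sendall(('EXITED %d\n' % job.returncode).encode())
        conn.close()

if __name__ == '__main__':
    serve()

A web process would connect to SOCKET_PATH, send a line such as START python manage.py crawlwebpages ..., remember the returned ID, and later send STATUS <id> to find out whether the job is still running.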

answered by glglgl