How do I run long term (infinite) Python processes?

Tags: python, apache, daemon, infinite-loop

I've recently started experimenting with using Python for web development. So far I've had some success using Apache with mod_wsgi and the Django web framework for Python 2.7. However I have run into some issues with having processes constantly running, updating information and such.

I have written a script I call "daemonManager.py" that can start and stop all or individual python update loops (Should I call them Daemons?). It does that by forking, then loading the module for the specific functions it should run and starting an infinite loop. It saves a PID file in /var/run to keep track of the process. So far so good. The problems I've encountered are:

Now and then one of the processes will just quit. I check ps in the morning and the process is just gone. No errors were logged (I'm using the logging module), and I'm covering every exception I can think of and logging them. Also, I don't think these quitting processes have anything to do with my code, because all my processes run completely different code and exit at pretty similar intervals. I could be wrong of course. Is it normal for Python processes to just die after they've run for days/weeks? How should I tackle this problem? Should I write another daemon that periodically checks if the other daemons are still running? What if that daemon stops? I'm at a loss on how to handle this.
How can I programmatically know if a process is still running or not? I'm saving the PID files in /var/run and checking if the PID file is there to determine whether or not the process is running. But if the process just dies of unexpected causes, the PID file will remain. I therefore have to delete these files every time a process crashes (a couple of times per week), which sort of defeats the purpose. I guess I could check if a process is running at the PID in the file, but what if another process has started and was assigned the PID of the dead process? My daemon would think that the process is running fine even if it's long dead. Again I'm at a loss just how to deal with this.

I will accept any useful answer on how best to run infinite Python processes, hopefully one that also sheds some light on the above problems.

I'm using Apache 2.2.14 on an Ubuntu machine.
My Python version is 2.7.2


asked Dec 31 '11 01:12

Hubro

2 Answers

I'll open by stating that this is one way to manage a long running process (LRP) -- not the de facto standard by any stretch.

In my experience, the best possible product comes from concentrating on the specific problem you're dealing with, while delegating supporting tech to other libraries. In this case, I'm referring to the act of backgrounding processes (the art of the double fork), monitoring, and log redirection.

My favorite solution is http://supervisord.org/

Using a system like supervisord, you basically write a conventional python script that performs a task while stuck in an "infinite" loop.

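#!/usr/bin/python

import sys
import time

def main_loop():
    # Loop until the process is interrupted or killed.
    while True:
        # do your stuff...
        time.sleep(0.1)  # brief pause so the loop does not spin the CPU

if __name__ == '__main__':
    try:
        main_loop()
    except KeyboardInterrupt:
        # Ctrl-C in a terminal: report on stderr and exit cleanly.
        sys.stderr.write('\nExiting by user request.\n\n')
        sys.exit(0)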

Writing your script this way makes it simple and convenient to develop and debug (you can easily start/stop it in a terminal, watching the log output as events unfold). When it comes time to move it into production, you simply define a supervisor config that calls your script (here's the full example for defining a "program", much of which is optional: http://supervisord.org/configuration.html#program-x-section-example).
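As a rough sketch, a minimal program section might look like the one below. The program name `updater` and both paths are placeholders I made up, not anything from the question; the option names themselves come from the supervisor configuration docs linked above.

```ini
; Hypothetical program name and paths; adjust them to your setup.
[program:updater]
command=/usr/bin/python /path/to/updater.py
autostart=true            ; start when supervisord itself starts
autorestart=true          ; restart the script whenever it exits
startsecs=5               ; must stay up this long to count as started
redirect_stderr=true      ; fold stderr into the stdout log
stdout_logfile=/var/log/updater.log
```

After adding the section, `supervisorctl reread` followed by `supervisorctl update` should pick it up, and `supervisorctl status` shows whether the program is running.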

Supervisor has a bunch of configuration options so I won't enumerate them, but I will say that it specifically solves the problems you describe:

Backgrounding/Daemonizing
PID tracking (can be configured to restart a process should it terminate unexpectedly)
Log normally in your script (stream handler if using logging module rather than printing) but let supervisor redirect to a file for you.
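The last point can be sketched like this: attach a plain stream handler and let supervisor capture the stream. The names `build_logger`, `main_loop`, `do_work` and `max_iterations` are made up for illustration, and `max_iterations` exists only so the loop can be exercised without running forever.

```python
import logging
import time


def build_logger(name="updater", stream=None):
    """Return a logger that writes to a stream (sys.stderr when stream is None)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeat calls
        handler = logging.StreamHandler(stream)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def main_loop(logger, do_work, interval=0.1, max_iterations=None):
    """Call do_work repeatedly; log any exception and keep going."""
    count = 0
    while max_iterations is None or count < max_iterations:
        try:
            do_work()
        except Exception:
            # logger.exception records the message plus the full traceback
            logger.exception("update failed; continuing")
        count += 1
        time.sleep(interval)
    return count
```

In the real script you would call something like `main_loop(build_logger(), your_update_function)` with no iteration limit. Catching `Exception` at the top of the loop keeps one bad iteration from ending the process, while still leaving a traceback in the log that supervisor writes.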


answered Oct 23 '22 11:10

Owen Nelson

You should consider Python processes as able to run "forever" assuming you don't have any memory leaks in your program, the Python interpreter, or any of the Python libraries / modules that you are using. (Even in the face of memory leaks, you might be able to run forever if you have sufficient swap space on a 64-bit machine. Decades, if not centuries, should be doable. I've had Python processes survive just fine for nearly two years on limited hardware -- before the hardware needed to be moved.)

Ensuring programs restart when they die used to be very simple back when Linux distributions used SysV-style init -- you just added a new line to /etc/inittab and init(8) would spawn your program at boot and re-spawn it if it died. (I know of no mechanism to replicate this functionality with the new upstart init-replacement that many distributions are using these days. I'm not saying it is impossible, I just don't know how to do it.)

But even the init(8) mechanism of years gone by wasn't as flexible as some would have liked. The daemontools package by DJB is one example of process control-and-monitoring tools intended to keep daemons living forever. The Linux-HA suite provides another similar tool, though it might provide too much "extra" functionality to be justified for this task. monit is another option.
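To make the idea behind these tools concrete, here is a toy sketch of a respawn loop. It is not how any of the packages above are implemented, and `supervise`, `max_runs` and `backoff` are names invented for this example. The point it illustrates is that a parent which starts the child itself can wait on it directly, so it needs no PID file and cannot be fooled by PID reuse.

```python
import subprocess
import time


def supervise(argv, max_runs=None, backoff=1.0):
    """Run argv, starting it again each time it exits.

    Returns the list of exit codes seen. With max_runs=None it never returns.
    """
    exit_codes = []
    while max_runs is None or len(exit_codes) < max_runs:
        child = subprocess.Popen(argv)
        # wait() blocks until this specific child exits and returns its status
        exit_codes.append(child.wait())
        time.sleep(backoff)  # short pause so a crash loop does not spin
    return exit_codes
```

For example, `supervise(["/usr/bin/python", "/path/to/updater.py"])` (a placeholder path) would keep the script running for as long as the supervising process itself lives. That last caveat is exactly why the dedicated tools, or init itself, are the better place for this job.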

answered Oct 23 '22 11:10

sarnold
