
Python subprocess.Popen erroring with OSError: [Errno 12] Cannot allocate memory after period of time

Note: This question has been re-asked with a summary of all debugging attempts here.


I have a Python script that runs as a background process, executing every 60 seconds. Part of it is a call to subprocess.Popen to get the output of ps.

ps = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE).communicate()[0]

After running for a few days, the call fails with:

File "/home/admin/sd-agent/checks.py", line 436, in getProcesses
File "/usr/lib/python2.4/subprocess.py", line 533, in __init__
File "/usr/lib/python2.4/subprocess.py", line 835, in _get_handles
OSError: [Errno 12] Cannot allocate memory

However the output of free on the server is:

$ free -m
                  total       used       free     shared     buffers    cached
Mem:                894        345        549          0          0          0
-/+ buffers/cache:  345        549
Swap:                 0          0          0

I have searched around for the problem and found this article which says:

Solution is to add more swap space to your server. When the kernel is forking to start the modeler or discovery process, it first ensures there's enough space available on the swap store the new process if needed.

I note from the free output above that there is no swap available. Is this likely to be the problem, and what other solutions might there be?

Update 13th Aug 09: The code above is called every 60 seconds as part of a series of monitoring functions. The process is daemonized and the check is scheduled using sched. The specific code for the above function is:

def getProcesses(self):
    self.checksLogger.debug('getProcesses: start')

    # Memory logging (case 27152)
    if self.agentConfig['debugMode'] and sys.platform == 'linux2':
        mem = subprocess.Popen(['free', '-m'], stdout=subprocess.PIPE).communicate()[0]
        self.checksLogger.debug('getProcesses: memory before Popen - ' + str(mem))

    # Get output from ps
    try:
        self.checksLogger.debug('getProcesses: attempting Popen')

        ps = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE).communicate()[0]

    except Exception, e:
        import traceback
        self.checksLogger.error('getProcesses: exception = ' + traceback.format_exc())
        return False

    self.checksLogger.debug('getProcesses: Popen success, parsing')

    # Memory logging (case 27152)
    if self.agentConfig['debugMode'] and sys.platform == 'linux2':
        mem = subprocess.Popen(['free', '-m'], stdout=subprocess.PIPE).communicate()[0]
        self.checksLogger.debug('getProcesses: memory after Popen - ' + str(mem))

    # Split out each process
    processLines = ps.split('\n')

    del processLines[0] # Removes the headers
    processLines.pop() # Removes a trailing empty line

    processes = []

    self.checksLogger.debug('getProcesses: Popen success, parsing, looping')

    for line in processLines:
        line = line.split(None, 10)
        processes.append(line)

    self.checksLogger.debug('getProcesses: completed, returning')

    return processes

This is part of a bigger class called checks which is initialised once when the daemon is started.

The entire checks class can be found at http://github.com/dmytton/sd-agent/blob/82f5ff9203e54d2adeee8cfed704d09e3f00e8eb/checks.py with the getProcesses function defined from line 442. This is called by doChecks() starting at line 520.

asked Aug 01 '09 15:08 by davidmytton


2 Answers

You've perhaps got a memory leak bounded by some resource limit (RLIMIT_DATA, RLIMIT_AS?) inherited by your Python script. Check your ulimit(1) settings before you run your script, and profile the script's memory usage, as others have suggested.

What do you do with the variable ps after the code snippet you show us? Do you keep a reference to it, never to be freed? Quoting the subprocess module docs:

Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited.

... and ps aux can be verbose on a busy system...
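If buffering the whole output is a concern, one option is to read the pipe line by line instead of calling communicate(). The following is a minimal sketch, not the agent's actual code: the function name iter_process_rows is hypothetical, and it assumes a modern Python 3 interpreter rather than the Python 2.4 shown in the question.

```python
import subprocess


def iter_process_rows(cmd=("ps", "aux")):
    """Yield one parsed row per process, reading the output line by line."""
    proc = subprocess.Popen(list(cmd), stdout=subprocess.PIPE,
                            universal_newlines=True)
    try:
        next(proc.stdout, None)            # skip the header row
        for line in proc.stdout:
            fields = line.split(None, 10)  # same split as getProcesses
            if fields:                     # ignore blank lines
                yield fields
    finally:
        proc.stdout.close()  # release the pipe's file descriptor
        proc.wait()          # reap the child so no zombie is left behind
```

The raw output is never held as a single string this way, although a caller that collects every row into a list still keeps the parsed data in memory.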

Update

You can check rlimits from within your Python script using the resource module:

import resource
print resource.getrlimit(resource.RLIMIT_DATA) # => (soft_lim, hard_lim)
print resource.getrlimit(resource.RLIMIT_AS)

If these return "unlimited" -- (-1, -1) -- then my hypothesis is incorrect and you may move on!

See also resource.getrusage, especially the ru_??rss fields, which can help you instrument memory consumption from within the Python script, without shelling out to an external program.
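As a rough illustration of that kind of instrumentation, the sketch below records the peak resident set size around a piece of work. The helper names peak_rss and rss_growth are hypothetical, the code assumes Python 3 on a Unix system, and the unit of ru_maxrss differs by platform (kilobytes on Linux, bytes on macOS).

```python
import resource


def peak_rss():
    """Peak resident set size of this process (kilobytes on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def rss_growth(work):
    """Run work() and return how much the peak RSS grew while it ran."""
    before = peak_rss()
    work()
    return peak_rss() - before
```

Logging rss_growth for each 60-second check over a few days might show whether the daemon's footprint is creeping upward. Because ru_maxrss is a high-water mark, it never decreases, so it can reveal growth but not memory that was later released.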

answered Sep 28 '22 01:09 by pilcrow


When you use Popen, you need to pass close_fds=True if you want it to close extra file descriptors.

Creating a new pipe, which occurs in the _get_handles function from the backtrace, creates 2 file descriptors, but your current code never closes them and you're eventually hitting your system's max fd limit.

Not sure why the error you're getting indicates an out of memory condition: it should be a file descriptor error as the return value of pipe() has an error code for this problem.
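One way to test the descriptor-leak hypothesis is to count the process's open file descriptors before and after each call. The sketch below is an assumption-laden illustration, not the agent's code: count_open_fds and run_and_capture are hypothetical names, /proc/self/fd exists only on Linux, and the code targets Python 3 (where close_fds already defaults to True on POSIX, unlike Python 2.4).

```python
import os
import subprocess


def count_open_fds():
    """Count this process's open file descriptors (Linux only: /proc/self/fd)."""
    return len(os.listdir("/proc/self/fd"))


def run_and_capture(cmd):
    """Run cmd and return its stdout as bytes."""
    # close_fds=True stops the child from inheriting unrelated descriptors.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, close_fds=True)
    # communicate() reads to EOF, closes the pipe, and waits for the child.
    return proc.communicate()[0]
```

If the count logged around run_and_capture(['ps', 'aux']) stays flat across many iterations, descriptors are not accumulating and the cause probably lies elsewhere.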

answered Sep 28 '22 02:09 by Mark