Understanding Python fork and memory allocation errors

Tags: python, linux, fork

I have a memory-intensive Python application (it uses between hundreds of MB and several GB).
The main application needs to run a couple of VERY SMALL Linux executables, e.g.

from subprocess import Popen, PIPE

child = Popen("make html", cwd=r'../../docs', stdout=PIPE, shell=True)
child.wait()

When I run these external utilities (once, at the end of the long main process run) using subprocess.Popen, I sometimes get OSError: [Errno 12] Cannot allocate memory.
I don't understand why... The requested process is tiny!
The system has enough memory for many more shells.

I'm using Linux (Ubuntu 12.10, 64-bit), so I guess subprocess calls fork().
And fork() duplicates my existing process, thus doubling the amount of memory consumed, and fails??
What happened to "copy on write"?

Can I spawn a new process without fork() (or at least without copying memory, i.e. starting fresh)?

Related:

The difference between fork(), vfork(), exec() and clone()

fork() & memory allocation behavior

Python subprocess.Popen erroring with OSError: [Errno 12] Cannot allocate memory after period of time

Python memory allocation error using subprocess.Popen

318

asked Mar 15 '13 22:03

Tal Weiss

1 Answer

It doesn't appear that a real solution will be forthcoming (i.e. an alternate implementation of subprocess that uses vfork). So how about a cute hack? At the beginning of your process, spawn a helper process that hangs around with a small memory footprint, ready to spawn your subprocesses on request, and keep a connection to it open throughout the life of the main process.
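As an aside on the vfork point: newer Python versions (3.8 and later) expose os.posix_spawn and os.posix_spawnp on POSIX systems. As far as I know, recent glibc versions implement posix_spawn without duplicating the parent's address space, so it may sidestep the problem for simple commands. Treat the following as a minimal sketch under that assumption, not a drop-in replacement for subprocess; spawn_and_wait is a name made up for illustration.

```python
import os
import sys


def spawn_and_wait(argv):
    """Run argv with os.posix_spawnp and return its exit code.

    Hypothetical helper for illustration. Requires Python 3.8+ on a
    POSIX system. No pipes or redirections are set up here.
    """
    # posix_spawnp searches PATH for argv[0], like a shell would.
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Killed by a signal: report it as a negative number, as subprocess does.
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    print(spawn_and_wait(["ls", "-l"]))
```

If you need the child's output, you would have to set up the redirection yourself (posix_spawn accepts file actions for that), which is where the helper-process hack may still be simpler.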

Here's an example using rfoo (http://code.google.com/p/rfoo/) with a named Unix socket called rfoosocket (you could use any other connection type rfoo supports, or another RPC library):

Server:

import rfoo
import subprocess

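class MyHandler(rfoo.BaseHandler):
    def RPopen(self, cmd):
        # Run cmd in a shell from this small server process.
        c = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
        # communicate() drains the pipe while waiting, so a child that
        # writes a lot of output cannot block forever on a full pipe.
        out, _ = c.communicate()
        return out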

rfoo.UnixServer(MyHandler).start('rfoosocket')

Client:

import rfoo

# Waste a bunch of memory before spawning the child. Swap out the RPC below
# for a straight popen to show it otherwise fails. Tweak to suit your
# available system memory.
mem = [x for x in range(100000000)]

c = rfoo.UnixConnection().connect('rfoosocket')

print(rfoo.Proxy(c).RPopen('ls -l'))
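If you'd rather not depend on an RPC library, the same idea can be sketched with only the standard library. This is a rough Python 3 sketch, not a tested replacement for the rfoo version: it starts a tiny second interpreter early and talks to it over its stdin/stdout with one JSON message per line. SpawnHelper and HELPER_SOURCE are names made up for this example.

```python
import json
import subprocess
import sys

# Source of the helper process: reads one JSON request (an argv list) per
# line on stdin, runs it, and writes one JSON reply per line on stdout.
HELPER_SOURCE = r'''
import json, subprocess, sys
for line in iter(sys.stdin.readline, ""):
    argv = json.loads(line)
    try:
        proc = subprocess.Popen(argv, stdin=subprocess.DEVNULL,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        out, _ = proc.communicate()
        reply = {"returncode": proc.returncode,
                 "output": out.decode("utf-8", "replace")}
    except OSError as exc:
        reply = {"returncode": None, "output": str(exc)}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


class SpawnHelper(object):
    """Hypothetical wrapper around a small helper process started early."""

    def __init__(self):
        # Create this before the main process allocates its large data,
        # so that starting the helper itself happens from a small process.
        self._proc = subprocess.Popen(
            [sys.executable, "-c", HELPER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )

    def run(self, argv):
        """Ask the helper to run argv (a list of strings); return its reply dict."""
        self._proc.stdin.write(json.dumps(argv) + "\n")
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def close(self):
        # Closing stdin ends the helper's read loop, so it exits on its own.
        self._proc.stdin.close()
        self._proc.wait()
        self._proc.stdout.close()


if __name__ == "__main__":
    helper = SpawnHelper()
    mem = [x for x in range(1000000)]  # stand-in for the real workload
    print(helper.run(["ls", "-l"])["output"])
    helper.close()
```

The large allocation happens only after the helper exists, so later commands are launched from the small helper rather than from the big main process.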

If you need real-time, back-and-forth coprocess interaction with your spawned subprocesses, this model probably won't work as is, but you might be able to hack it in. You'll presumably want to adjust which arguments can be passed to Popen based on your specific needs, but that should be relatively straightforward.
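One way to tighten what reaches Popen, sketched here as an assumption about what you might want rather than something the rfoo example requires, is to have the server accept only the names of known commands and map them to fixed argument lists. ALLOWED_COMMANDS and resolve_command are hypothetical names for this example.

```python
# Hypothetical table of the only commands the helper is willing to run.
ALLOWED_COMMANDS = {
    "build_docs": ["make", "html"],
    "list_files": ["ls", "-l"],
}


def resolve_command(name):
    """Return the argv list for a known command name, or raise ValueError."""
    try:
        # Return a copy so callers cannot modify the table by accident.
        return list(ALLOWED_COMMANDS[name])
    except KeyError:
        raise ValueError("command not allowed: %r" % (name,))
```

The handler could then call subprocess.Popen(resolve_command(name), stdout=subprocess.PIPE) without shell=True, so the client never sends a raw shell string.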

You should also find it straightforward to launch the server at the start of the client, and to manage the socket file (or port) to be cleaned up on exit.
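For what it's worth, here is a rough sketch of that launch-and-cleanup step using only the standard library. The server command line and the socket path are placeholders for however you run the server above; start_server and wait_for_socket are names made up for this example.

```python
import atexit
import os
import subprocess
import sys
import time


def start_server(server_argv, socket_path):
    """Start the helper server and arrange for cleanup at interpreter exit.

    Call this early, before the main process grows. Returns the Popen
    object and the cleanup function (so it can also be called by hand).
    """
    proc = subprocess.Popen(server_argv)

    def cleanup():
        # Stop the server if it is still running, then remove the socket file.
        if proc.poll() is None:
            proc.terminate()
            proc.wait()
        if os.path.exists(socket_path):
            os.remove(socket_path)

    atexit.register(cleanup)
    return proc, cleanup


def wait_for_socket(socket_path, timeout=5.0):
    """Poll until the socket file appears; return False if it never does."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if os.path.exists(socket_path):
            return True
        time.sleep(0.05)
    return False


if __name__ == "__main__":
    # "server.py" is a hypothetical file containing the server code above.
    proc, cleanup = start_server([sys.executable, "server.py"], "rfoosocket")
    if wait_for_socket("rfoosocket"):
        pass  # connect with rfoo.UnixConnection().connect('rfoosocket') here
```

Note that atexit handlers do not run if the process is killed by a signal it doesn't handle, so a stale socket file may still need removing at the next start.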

107

answered Oct 20 '22 06:10

Eric Angell
