Python memory allocation error using subprocess.Popen

Tags:

I am doing some bioinformatics work. I have a python script that at one point calls a program to do an expensive process (sequence alignment..uses a lot of computational power and memory). I call it using subprocess.Popen. When I run it on a testcase, it completes and finishes fine. However, when I run it on the full file, where it would have to do this multiple times for different sets of inputs, it dies. Subprocess throws:

OSError: [Errno 12] Cannot allocate memory

I found a few links here and here and here to similar problems, but I'm not sure that they apply in my case.

By default, the sequence aligner will try to request 51000M of memory. It doesn't always use that much, but it might. With the full input loaded and processed, that much is not available. However, capping the amount it requests or will attempt to use at a lower amount that might be available when running still gives me the same error. I've also tried running with shell=True and same thing.

This has been bugging me for a few days now. Thanks for any help.

Edit: Expanding the traceback:

File "..../python2.6/subprocess.py", line 1037, in _execute_child
    self.pid=os.fork()
OSError: [Errno 12] Cannot allocate memory

throws the error.

Edit2: Running in python 2.6.4 on 64 bit ubuntu 10.4

943

asked Mar 15 '11 00:03

jmerkin

1 Answers

I feel really sorry for the OP. 6 years later and no one mentioned that this is a very common problem in Unix, and actually has nothing to do with python or bioinformatics. A call to os.fork() temporarily doubles the memory of the parent process (the memory of the parent process must be available to the child process), before throwing it all away to do an exec(). While this memory isn't always actually copied, the system must have enough memory to allow for it to be copied, and thus if you're parent process is using more than half of the system memory and you subprocess out even "wc -l", you're going to run into a memory error.

The solution is to use posix_spawn, or create all your subprocesses at the beginning of the script, while memory consumption is low, then use them later on after the parent process has done it's memory-intensive thing.

A google search using the keyworks "os.fork" and "memory" will show several Stack Overflow posts on the topic that can further explain what's going on :)

100

answered Sep 23 '22 18:09

J.J

Related questions
                            
                                Downsampling the number of entries in a list (without interpolation)
                            
                                Micropython or minimal python installation
                            
                                Does it make sense to check for identity in __eq__?
                            
                                Regex to match Domain.CCTLD
                            
                                Mutex locks vs Threading locks. Which to use?
                            
                                Efficiently carry out multiple string replacements in Python
                            
                                Can Python be good alternative for web app that would otherwise be done in Java EE? [closed]
                            
                                Is there a Python idiom for evaluating a list of functions/expressions with short-circuiting?
                            
                                How to strip source from distutils binary distributions?
                            
                                C++ vs Python precision
                            
                                Method signature for Jacobian of a least squares function in scipy
                            
                                Why doesn't a %en0 suffix work to connect a link-local IPv6 TCP socket in Python?
                            
                                Add new keys to a dictionary while incrementing existing values
                            
                                What would be the Python code to add time to a specific timestamp?
                            
                                numpy: efficiently reading a large array
                            
                                WSGI Middleware for OAuth authentication
                            
                                Get the value of specific JSON element in Python
                            
                                Normalizing street addresses in Django/Python
                            
                                How to draw the largest polygon from a set of points
                            
                                python ubuntu virtualenv -> error

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Python memory allocation error using subprocess.Popen

Tags:

python

memory-management

subprocess

jmerkin

People also ask

1 Answers

J.J

Recent Activity

Donate For Us