I found lots of similar questions asking about the size of an object at run time in Python. Some of the answers suggest setting a limit on the amount of memory of the sub-process. I do not want to set a limit on the memory of the sub-process. Here is what I want --
I'm using subprocess.Popen() to execute an external program. I can get standard output and error just fine with process.stdout.readlines() and process.stderr.readlines() after the process is complete.
I have a problem when an erroneous program gets into an infinite loop and keeps producing output. Since subprocess.Popen() stores the output data in memory, such an infinite loop quickly eats up the entire memory and the program slows down.
One solution is to run the command with a timeout, but programs take variable amounts of time to complete. A large timeout defeats the purpose for a program that normally finishes quickly but has entered an infinite loop.
Is there any simple way to put an upper limit, say 200MB, on the amount of data the command can produce? If it exceeds the limit, the command should get killed.
First: it is not subprocess.Popen() that stores the data, but the pipe between "us" and "our" subprocess. You shouldn't use readlines() in this case, as it buffers the data indefinitely and only returns it as a list at the end (so in this case, it is indeed this function that stores the data).
If you do something like this:

    byte_count = line_count = 0
    for line in process.stdout:          # read incrementally instead of buffering everything
        byte_count += len(line)
        line_count += 1
        if byte_count > 200000000 or line_count > 10000:
            # handle the described situation
            break
you can act as described in your question. But don't forget to kill the subprocess afterwards in order to stop it from producing further data.
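For example (a minimal sketch; process is the Popen object used above):

    process.kill()   # stop the runaway child from producing more output
    process.wait()   # reap it so it doesn't linger as a zombie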
But if you want to take care of stderr as well, you'd have to try to replicate the behaviour of process.communicate() with select() etc., and act appropriately.
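A rough sketch of that approach might look like this, assuming a POSIX system; "some_command" is a placeholder, and the 200MB cap comes from the question:

    import os
    import select
    import subprocess

    LIMIT = 200 * 1000 * 1000  # 200MB cap on combined output

    process = subprocess.Popen(["some_command"],
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)

    # map each pipe's file descriptor to the data collected from it
    buffers = {process.stdout.fileno(): b"", process.stderr.fileno(): b""}
    total = 0
    while buffers:
        # block until at least one of the remaining pipes has data
        readable, _, _ = select.select(list(buffers), [], [])
        for fd in readable:
            chunk = os.read(fd, 65536)   # unbuffered read, safe with select()
            if not chunk:                # EOF: the child closed this pipe
                del buffers[fd]
                continue
            buffers[fd] += chunk
            total += len(chunk)
        if total > LIMIT:
            process.kill()               # over the cap: stop the child
            process.wait()
            break

Reading with os.read() on the raw file descriptors avoids the pitfall of mixing select() with Python's buffered file objects, which can report a descriptor as not ready while data still sits in the buffer.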
There doesn't seem to be an easy answer to what you want:
http://linux.about.com/library/cmd/blcmdl2_setrlimit.htm
rlimit has flags to limit memory, CPU time, or the number of open files, but apparently nothing to limit the amount of I/O.
You should handle the case manually as already described.
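The limits rlimit does offer can still be applied to the child via Popen's preexec_fn (POSIX only), for example to cap CPU time as a complementary safeguard against runaway loops. A rough sketch, with the command name as a placeholder:

    import resource
    import subprocess

    def limit_cpu():
        # executed in the child process just before exec()
        resource.setrlimit(resource.RLIMIT_CPU, (60, 60))  # 60 seconds of CPU time

    process = subprocess.Popen(["some_command"],
                               stdout=subprocess.PIPE,
                               preexec_fn=limit_cpu)

Note that this caps CPU time, not the amount of output, so you would still combine it with the manual reading shown above.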