So I wrote a script that accesses a bunch of servers with nc on the command line. Originally I was using Python's commands module, calling commands.getoutput(), and the script ran in about 45 seconds. Since commands is deprecated, I want to change everything over to the subprocess module, but now the script takes 2m45s to run. Anyone have an idea of why this would be?
What I had before:
output = commands.getoutput("echo get file.ext | nc -w 1 server.com port_num")
Now I have:
p = Popen('echo get file.ext | nc -w 1 server.com port_num', shell=True, stdout=PIPE)
output = p.communicate()[0]
Thanks in advance for the help!
Most of your interaction with the Python subprocess module will be via the run() function. This blocking function will start a process and wait until the new process exits before moving on. The documentation recommends using run() for all cases that it can handle.
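For example, a minimal sketch of the question's command rewritten with run() (this assumes Python 3.7+, where capture_output and text are available; server.com and port_num are the placeholders from the question):

import subprocess

# Run the pipeline once, block until nc exits, and capture stdout as a string.
result = subprocess.run(
    "echo get file.ext | nc -w 1 server.com port_num",
    shell=True,
    capture_output=True,  # Python 3.7+; use stdout=subprocess.PIPE on older versions
    text=True,            # decode bytes to str
)
output = result.stdout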
The main difference is that subprocess.run() executes a command and waits for it to finish, while with subprocess.Popen you can keep doing your own work while the process runs and then call Popen.communicate() yourself when you are ready to send input and collect the output.
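A minimal sketch of the Popen side of that difference (Python 3.7+ assumed, with sleep standing in for a slow command):

import subprocess

# Popen returns immediately; the child runs in the background.
p = subprocess.Popen(["sleep", "2"], stdout=subprocess.PIPE, text=True)

# ... keep doing other Python work here while the child runs ...

# When you are ready, communicate() waits for the child to exit
# and returns whatever it wrote to stdout.
output, _ = p.communicate()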
When calling a Linux binary that takes a relatively long time through Python's subprocess module, does this release the GIL? Yes: the calling process releases the Global Interpreter Lock (GIL) while it waits on the child, so other Python threads can keep running.
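Because of that, plain threads are enough to overlap several slow external calls. A rough sketch, where the fetch() helper and the server names are hypothetical and port_num is the placeholder from the question:

import subprocess
from concurrent.futures import ThreadPoolExecutor

def fetch(server):
    # Each thread blocks in the OS waiting for its own nc process,
    # with the GIL released, so the calls genuinely overlap.
    p = subprocess.Popen(
        ["nc", "-w", "1", server, "port_num"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    out, _ = p.communicate("get file.ext\n")
    return out

servers = ["server1.com", "server2.com", "server3.com"]  # hypothetical hosts
with ThreadPoolExecutor(max_workers=len(servers)) as pool:
    results = list(pool.map(fetch, servers))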
I would expect subprocess to be slower than commands. Without meaning to suggest that this is the only reason your script is running slowly, you should take a look at the commands source code. There are fewer than 100 lines, and most of the work is delegated to functions from os, many of which are taken straight from the C POSIX libraries (at least on POSIX systems). Note that commands is Unix-only, so it doesn't have to do any extra work to ensure cross-platform compatibility.
Now take a look at subprocess. There are more than 1500 lines, all pure Python, doing all sorts of checks to ensure consistent cross-platform behavior. Based on this, I would expect subprocess to run slower than commands.
I timed the two modules, and on something quite basic, subprocess was almost twice as slow as commands.
>>> %timeit commands.getoutput('echo "foo" | cat')
100 loops, best of 3: 3.02 ms per loop
>>> %timeit subprocess.check_output('echo "foo" | cat', shell=True)
100 loops, best of 3: 5.76 ms per loop
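If you want to reproduce this comparison outside IPython, the standard timeit module works too. A rough sketch (it has to run under Python 2, since the commands module was removed in Python 3):

import timeit

commands_t = timeit.timeit(
    "commands.getoutput('echo \"foo\" | cat')",
    setup="import commands",
    number=100,
)
subprocess_t = timeit.timeit(
    "subprocess.check_output('echo \"foo\" | cat', shell=True)",
    setup="import subprocess",
    number=100,
)
print(commands_t / 100)    # average seconds per commands.getoutput call
print(subprocess_t / 100)  # average seconds per subprocess.check_output call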
Swiss suggests some good improvements that will help your script's performance. But even after applying them, note that subprocess is still slower.
>>> %timeit commands.getoutput('echo "foo" | cat')
100 loops, best of 3: 2.97 ms per loop
>>> %timeit Popen('cat', stdin=PIPE, stdout=PIPE).communicate('foo')[0]
100 loops, best of 3: 4.15 ms per loop
Assuming you are performing the above command many times in a row, this will add up, and account for at least some of the performance difference.
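For reference, a sketch of that same improvement applied to the question's original command (host and port left as the placeholders from the question): skip the shell and the extra echo process, and feed the request to nc directly.

from subprocess import Popen, PIPE

# One process instead of a shell plus echo plus nc; the request goes in via stdin.
p = Popen(["nc", "-w", "1", "server.com", "port_num"], stdin=PIPE, stdout=PIPE)
output = p.communicate("get file.ext\n")[0]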
In any case, I am interpreting your question as being about the relative performance of subprocess and commands, rather than about how to speed up your script. For the latter question, Swiss's answer is better.