I am using subprocess.run()
for some automated testing. Mostly to automate doing:
dummy.exe < file.txt > foo.txt
diff file.txt foo.txt
If you execute the above redirection in a shell, the two files are always identical. But whenever file.txt
is too long, the below Python code does not return the correct result.
This is the Python code:
import subprocess
import sys
def main(argv):
exe_path = r'dummy.exe'
file_path = r'file.txt'
with open(file_path, 'r') as test_file:
stdin = test_file.read().strip()
p = subprocess.run([exe_path], input=stdin, stdout=subprocess.PIPE, universal_newlines=True)
out = p.stdout.strip()
err = p.stderr
if stdin == out:
print('OK')
else:
print('failed: ' + out)
if __name__ == "__main__":
main(sys.argv[1:])
Here is the C++ code in dummy.cc
:
#include <iostream>
int main()
{
int size, count, a, b;
std::cin >> size;
std::cin >> count;
std::cout << size << " " << count << std::endl;
for (int i = 0; i < count; ++i)
{
std::cin >> a >> b;
std::cout << a << " " << b << std::endl;
}
}
file.txt
can be anything like this:
1 100000
0 417
0 842
0 919
...
The second integer on the first line is the number of lines following, hence here file.txt
will be 100,001 lines long.
Question: Am I misusing subprocess.run() ?
Edit
My exact Python code after comment (newlines,rb) is taken into account:
import subprocess
import sys
import os
def main(argv):
base_dir = os.path.dirname(__file__)
exe_path = os.path.join(base_dir, 'dummy.exe')
file_path = os.path.join(base_dir, 'infile.txt')
out_path = os.path.join(base_dir, 'outfile.txt')
with open(file_path, 'rb') as test_file:
stdin = test_file.read().strip()
p = subprocess.run([exe_path], input=stdin, stdout=subprocess.PIPE)
out = p.stdout.strip()
if stdin == out:
print('OK')
else:
with open(out_path, "wb") as text_file:
text_file.write(out)
if __name__ == "__main__":
main(sys.argv[1:])
Here is the first diff:
Here is the input file: https://drive.google.com/open?id=0B--mU_EsNUGTR3VKaktvQVNtLTQ
We should avoid using 'shell=true' in subprocess call to avoid shell injection vulnerabilities. In this call you have to pass a string as a command to the shell. If call_method is user controlled then it can be used to execute any arbitrary command which can affect system.
Setting the shell argument to a true value causes subprocess to spawn an intermediate shell process, and tell it to run the command. In other words, using an intermediate shell means that variables, glob patterns, and other special shell features in the command string are processed before the command is run.
To capture the output of the subprocess. run method, use an additional argument named “capture_output=True”. You can individually access stdout and stderr values by using “output. stdout” and “output.
The main difference is that subprocess. run() executes a command and waits for it to finish, while with subprocess. Popen you can continue doing your stuff while the process finishes and then just repeatedly call Popen. communicate() yourself to pass and receive data to your process.
To reproduce, the shell command:
subprocess.run("dummy.exe < file.txt > foo.txt", shell=True, check=True)
without the shell in Python:
with open('file.txt', 'rb', 0) as input_file, \
open('foo.txt', 'wb', 0) as output_file:
subprocess.run(["dummy.exe"], stdin=input_file, stdout=output_file, check=True)
It works with arbitrary large files.
You could use subprocess.check_call()
in this case (available since Python 2), instead of subprocess.run()
that is available only in Python 3.5+.
Works very well thanks. But then why was the original failing ? Pipe buffer size as in Kevin Answer ?
It has nothing to do with OS pipe buffers. The warning from the subprocess docs that @Kevin J. Chase cites is unrelated to subprocess.run()
. You should care about OS pipe buffers only if you use process = Popen()
and manually read()/write() via multiple pipe streams (process.stdin/.stdout/.stderr
).
It turns out that the observed behavior is due to Windows bug in the Universal CRT. Here's the same issue that is reproduced without Python: Why would redirection work where piping fails?
As said in the bug description, to workaround it:
ReadFile()
directly instead of std::cin
g++
on Windows
The bug affects only text pipes i.e., the code that uses <>
should be fine (stdin=input_file, stdout=output_file
should still work or it is some other bug).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With