Why is subprocess.run output different from shell output of same command?

Tags: c++, python, subprocess, python-3.x, io-redirection

I am using subprocess.run() for some automated testing. Mostly to automate doing:

dummy.exe < file.txt > foo.txt
diff file.txt foo.txt

If you execute the above redirection in a shell, the two files are always identical. But whenever file.txt is too long, the Python code below does not return the correct result.

This is the Python code:

import subprocess
import sys


def main(argv):

    exe_path = r'dummy.exe'
    file_path = r'file.txt'

    with open(file_path, 'r') as test_file:
        stdin = test_file.read().strip()
        p = subprocess.run([exe_path], input=stdin, stdout=subprocess.PIPE, universal_newlines=True)
        out = p.stdout.strip()
        err = p.stderr
        if stdin == out:
            print('OK')
        else:
            print('failed: ' + out)

if __name__ == "__main__":
    main(sys.argv[1:])

Here is the C++ code in dummy.cc:

#include <iostream>


int main()
{
    int size, count, a, b;
    std::cin >> size;
    std::cin >> count;

    std::cout << size << " " << count << std::endl;


    for (int i = 0; i < count; ++i)
    {
        std::cin >> a >> b;
        std::cout << a << " " << b << std::endl;
    }
}

file.txt can be anything like this:

1 100000
0 417
0 842
0 919
...

The second integer on the first line is the number of lines following, hence here file.txt will be 100,001 lines long.

Question: Am I misusing subprocess.run()?

Edit

My exact Python code, after taking the comments (newlines, 'rb') into account:

import subprocess
import sys
import os


def main(argv):

    base_dir = os.path.dirname(__file__)
    exe_path = os.path.join(base_dir, 'dummy.exe')
    file_path = os.path.join(base_dir, 'infile.txt')
    out_path = os.path.join(base_dir, 'outfile.txt')

    with open(file_path, 'rb') as test_file:
        stdin = test_file.read().strip()
        p = subprocess.run([exe_path], input=stdin, stdout=subprocess.PIPE)
        out = p.stdout.strip()
        if stdin == out:
            print('OK')
        else:
            with open(out_path, "wb") as text_file:
                text_file.write(out)

if __name__ == "__main__":
    main(sys.argv[1:])

Here is the first diff:

[Image: screenshot of the first diff]

Here is the input file: https://drive.google.com/open?id=0B--mU_EsNUGTR3VKaktvQVNtLTQ


asked Jun 09 '16 19:06

user2346536

1 Answer

To reproduce the shell command:

subprocess.run("dummy.exe < file.txt > foo.txt", shell=True, check=True)

without the shell, in Python:

with open('file.txt', 'rb', 0) as input_file, \
     open('foo.txt', 'wb', 0) as output_file:
    subprocess.run(["dummy.exe"], stdin=input_file, stdout=output_file, check=True)

It works with arbitrarily large files.

You could use subprocess.check_call() in this case (available since Python 2.5) instead of subprocess.run(), which is available only in Python 3.5+.
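The whole check from the question (run with redirected files, then diff) could be wrapped in one helper. The following is only a minimal sketch: the name run_and_compare and the line-by-line comparison are assumptions made for illustration, not part of the original code.

```python
import subprocess


def read_lines(path):
    """Return the file's lines as bytes, ignoring CRLF/LF differences
    and leading/trailing whitespace of the whole file."""
    with open(path, 'rb') as f:
        return f.read().strip().splitlines()


def run_and_compare(cmd, in_path, out_path):
    """Hypothetical helper: run cmd with stdin/stdout redirected to files,
    then report whether the output file matches the input file."""
    # Real file handles are passed to the child, so no pipe is involved.
    with open(in_path, 'rb', 0) as input_file, \
         open(out_path, 'wb', 0) as output_file:
        subprocess.run(cmd, stdin=input_file, stdout=output_file, check=True)

    # Equivalent of "diff file.txt foo.txt", reduced to a yes/no answer.
    return read_lines(in_path) == read_lines(out_path)
```

For the question's setup, the call would presumably look like run_and_compare(['dummy.exe'], 'file.txt', 'foo.txt'). Comparing split lines rather than raw bytes is a deliberate choice here, so that a CRLF versus LF difference alone is not reported as a failure.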

> Works very well, thanks. But then why was the original failing? Pipe buffer size, as in Kevin's answer?

It has nothing to do with OS pipe buffers. The warning from the subprocess docs that @Kevin J. Chase cites is unrelated to subprocess.run(). You should care about OS pipe buffers only if you use process = Popen() and manually read()/write() via multiple pipe streams (process.stdin/.stdout/.stderr).
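To illustrate why pipe buffers are not a concern here, the sketch below pushes data much larger than a typical OS pipe buffer through a child process using communicate(), which is what subprocess.run(input=...) calls internally. The echo child (a Python one-liner standing in for dummy.exe) and the name echo_via_pipes are assumptions made so that the snippet runs anywhere.

```python
import subprocess
import sys

# Stand-in child that copies stdin to stdout (assumption: replaces dummy.exe).
ECHO_CMD = [sys.executable, '-c',
            'import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())']


def echo_via_pipes(data):
    """Send data through the child's stdin pipe and collect its stdout."""
    process = subprocess.Popen(ECHO_CMD, stdin=subprocess.PIPE,
                               stdout=subprocess.PIPE)
    # communicate() feeds stdin and drains stdout concurrently, so neither
    # side blocks forever on a full OS pipe buffer.
    out, _ = process.communicate(input=data)
    return out
```

Only code that calls process.stdin.write() and process.stdout.read() by hand, one after the other, has to worry about a full pipe buffer.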

It turns out that the observed behavior is due to a Windows bug in the Universal CRT. Here's the same issue reproduced without Python: "Why would redirection work where piping fails?"

As the bug description says, to work around it:

"use a binary pipe and do text mode CRLF => LF translation manually on the reader side", or use ReadFile() directly instead of std::cin
or wait for the Windows 10 update this summer (where the bug should be fixed)
or use a different C++ compiler, e.g., there is no issue if you use g++ on Windows

The bug affects only text pipes, i.e., code that uses the < and > redirections should be fine (stdin=input_file, stdout=output_file should still work; if it does not, it is some other bug).


answered Oct 20 '22 01:10

jfs
