
Unittesting Python code which uses subprocess.Popen

I have a Python project in which I read external files, process them, and write the results to a new file. The input files can either be read directly, or extracted from a git repository using git show. The function to call git show and return stdout looks like this:

import subprocess


def git_show(fname, rev):
    '''Runs git show and returns stdout'''
    process = subprocess.Popen(['git', 'show', '{}:{}'.format(rev, fname)],
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    # communicate() has already waited; wait() just returns the exit code
    ret_code = process.wait()
    if ret_code:
        raise Exception(stderr)
    return stdout

I have unittests which test the whole processing part of the program, i.e., everything apart from reading and writing the files. However, I have stumbled upon (and fixed) issues with the encoding of the string returned from git_show(), which depends on the Python version, and quite possibly on the OS and the actual file being read.

I would like to set up a unittest for git_show() so I can make sure the whole application works, from input to output. However, as far as I know, this is not possible without having an actual git repository to test on. The whole package is version managed with git, and I expect that if I have a git repository inside a git repository that might lead to problems on its own, and a voice in my head tells me that might not be the best solution anyway.

How can one best achieve unittesting code which gets input from git show (and in general, the command line / Popen.communicate())?

cmeeren asked May 29 '15 11:05




1 Answer

Perhaps you want one, or a combination, of several different kinds of tests.

Unit tests

Test a small part of your code, within your code.

  1. mock out subprocess.Popen
  2. return static values in stdout, stderr
  3. check that processing is correct

The sample code is pretty small, so you can only test that stdout is really returned and that an exception is raised when wait() returns non-zero.
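A minimal sketch of such a unit test, assuming Python 3 and the standard unittest.mock module (on Python 2 the separate mock package offers the same interface). git_show() is copied from the question so the example is self-contained; make_fake_popen() and the test names are hypothetical helpers introduced only for illustration.

```python
import subprocess
import unittest
from unittest import mock


def git_show(fname, rev):
    '''Copy of the function from the question.'''
    process = subprocess.Popen(['git', 'show', '{}:{}'.format(rev, fname)],
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    ret_code = process.wait()
    if ret_code:
        raise Exception(stderr)
    return stdout


def make_fake_popen(stdout, stderr, ret_code):
    '''Hypothetical helper: build a stand-in for the subprocess.Popen class.'''
    fake_process = mock.Mock()
    fake_process.communicate.return_value = (stdout, stderr)
    fake_process.wait.return_value = ret_code
    # Calling the fake "class" returns the fake process object.
    return mock.Mock(return_value=fake_process)


class GitShowUnitTest(unittest.TestCase):

    def test_returns_stdout_on_success(self):
        fake_popen = make_fake_popen(b'file contents\n', b'', 0)
        with mock.patch('subprocess.Popen', fake_popen):
            self.assertEqual(git_show('a.txt', 'HEAD'), b'file contents\n')
        # The command line built for git can be checked as well.
        fake_popen.assert_called_once_with(
            ['git', 'show', 'HEAD:a.txt'],
            stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    def test_raises_on_nonzero_exit(self):
        fake_popen = make_fake_popen(b'', b'fatal: bad revision', 1)
        with mock.patch('subprocess.Popen', fake_popen):
            with self.assertRaises(Exception):
                git_show('a.txt', 'nope')
```

Saved in a test module, this can be run with python -m unittest. No git process is started, so these tests say nothing about what git really returns, only about how git_show() handles it.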

Something in between

Test vectors: that is, for a given set input, a set output should be produced.

  1. mock out git, instead use cat vector1.txt encoded in specific way
  2. test result
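One way this could look, as a sketch under some assumptions: Python 3, and a real child process that prints the vector file in place of git. Instead of cat, the Python interpreter itself is used as the cat-like command so the example also runs where cat is not installed. run_with_vector() and roundtrip() are hypothetical helper names, and git_show() is again copied from the question.

```python
import os
import subprocess
import sys
import tempfile
from unittest import mock


def git_show(fname, rev):
    '''Copy of the function from the question.'''
    process = subprocess.Popen(['git', 'show', '{}:{}'.format(rev, fname)],
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    ret_code = process.wait()
    if ret_code:
        raise Exception(stderr)
    return stdout


# Stand-in for "cat <path>": writes the file's raw bytes to stdout.
CAT_COMMAND = [
    sys.executable, '-c',
    'import sys; sys.stdout.buffer.write(open(sys.argv[1], "rb").read())',
]


def run_with_vector(vector_path, fname='ignored.txt', rev='HEAD'):
    '''Hypothetical helper: call git_show() with git swapped for the cat-like command.'''
    real_popen = subprocess.Popen

    def fake_popen(args, **kwargs):
        # Ignore the git command line and emit the test vector instead.
        return real_popen(CAT_COMMAND + [vector_path], **kwargs)

    with mock.patch('subprocess.Popen', fake_popen):
        return git_show(fname, rev)


def roundtrip(text, encoding):
    '''Hypothetical helper: write one vector and return (expected, actual) bytes.'''
    expected = text.encode(encoding)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'vector1.txt')
        with open(path, 'wb') as handle:
            handle.write(expected)
        return expected, run_with_vector(path)
```

For example, roundtrip(u'na\u00efve\n', 'latin-1') should give two equal byte strings. In a real test you would feed the returned bytes into your decoding and processing code and compare against the expected output for that vector.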

Integration tests

Test how your code connects to external entities, in this case git. Such tests protect you from accidentally changing what your code expects of the external system. That is, they "freeze" the API.

  1. create a tarball with a small git repository
  2. optionally pack git binary into same tarball
  3. unpack the tarball
  4. run git command
  5. compare output to expected
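A rough sketch of steps 3 to 5, assuming Python 3, a trusted tarball that holds the repository (including its .git directory) at the top level, and whichever git is found on PATH rather than a packed binary. git_show_from_tarball() is a hypothetical helper, and git_show() is copied from the question. Because the repository only exists unpacked in a temporary directory, it never sits inside your project's own repository.

```python
import os
import shutil
import subprocess
import tarfile
import tempfile


def git_show(fname, rev):
    '''Copy of the function from the question.'''
    process = subprocess.Popen(['git', 'show', '{}:{}'.format(rev, fname)],
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    ret_code = process.wait()
    if ret_code:
        raise Exception(stderr)
    return stdout


def git_show_from_tarball(tarball_path, fname, rev):
    '''Hypothetical helper: unpack a tarred repository and run git_show() in it.'''
    workdir = tempfile.mkdtemp()
    previous_cwd = os.getcwd()
    try:
        # Step 3: unpack the tarball (assumed to be a trusted test fixture).
        with tarfile.open(tarball_path) as archive:
            archive.extractall(workdir)
        # Step 4: git_show() uses the current directory, so run it from the repo.
        os.chdir(workdir)
        return git_show(fname, rev)
    finally:
        # Restore the working directory and clean up even if git fails.
        os.chdir(previous_cwd)
        shutil.rmtree(workdir, ignore_errors=True)
```

Step 5 is then an ordinary assertion comparing the returned bytes with the content you know was committed to the fixture repository. Such a test can be skipped when shutil.which('git') returns None.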
Dima Tisnek answered Oct 28 '22 22:10
