Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python write both commands and their output to a file

Is there any way to write both commands and their output to an external file?

Let's say I have a script outtest.py :

import random
from statistics import median, mean

d = [random.random()**2 for _ in range(1000)]
d_mean = round(mean(d), 2)
print(f'mean: {d_mean}')
d_median = round(median(d), 2)
print(f'median: {d_median}')

Now if I want to capture its output only I know I can just do:

python3 outtest.py > outtest.txt

However, this will only give me an outtest.txt file with e.g.:

mean: 0.34
median: 0.27

What I'm looking for is a way to get an output file like:

import random
from statistics import median, mean

d = [random.random()**2 for _ in range(1000)]
d_mean = round(mean(d), 2)
print(f'mean: {d_mean}')
>> mean: 0.34
d_median = round(median(d), 2)
print(f'median: {d_median}')
>> median: 0.27

Or some other format (markdown, whatever). Essentially, something like jupyter notebook or Rmarkdown but with using standard .py files.

Is there any easy way to achieve this?

like image 944
pieca Avatar asked Feb 19 '20 09:02

pieca


People also ask

What is the command for output in Python?

Python | Output using print() function.

How do you write output to a file in Terminal?

the shortcut is Ctrl + Shift + S ; it allows the output to be saved as a text file, or as HTML including colors!


2 Answers

Here's a script I just wrote which quite comprehensively captures printed output and prints it alongside the code, no matter how it's printed or how much is printed in one go. It uses the ast module to parse the Python source, executes the program one statement at a time (kind of as if it was fed to the REPL), then prints the output from each statement. Python 3.6+ (but easily modified for e.g. Python 2.x):

import ast
import sys

if len(sys.argv) < 2:
    print(f"Usage: {sys.argv[0]} <script.py> [args...]")
    exit(1)

# Replace stdout so we can mix program output and source code cleanly
real_stdout = sys.stdout
class FakeStdout:
    ''' A replacement for stdout that prefixes # to every line of output, so it can be mixed with code. '''
    def __init__(self, file):
        self.file = file
        self.curline = ''

    def _writerow(self, row):
        self.file.write('# ')
        self.file.write(row)
        self.file.write('\n')

    def write(self, text):
        if not text:
            return
        rows = text.split('\n')
        self.curline += rows.pop(0)
        if not rows:
            return
        for row in rows:
            self._writerow(self.curline)
            self.curline = row

    def flush(self):
        if self.curline:
            self._writerow(self.curline)
            self.curline = ''

sys.stdout = FakeStdout(real_stdout)

class EndLineFinder(ast.NodeVisitor):
    ''' This class functions as a replacement for the somewhat unreliable end_lineno attribute.

    It simply finds the largest line number among all child nodes. '''

    def __init__(self):
        self.max_lineno = 0

    def generic_visit(self, node):
        if hasattr(node, 'lineno'):
            self.max_lineno = max(self.max_lineno, node.lineno)
        ast.NodeVisitor.generic_visit(self, node)

# Pretend the script was called directly
del sys.argv[0]

# We'll walk each statement of the file and execute it separately.
# This way, we can place the output for each statement right after the statement itself.
filename = sys.argv[0]
source = open(filename, 'r').read()
lines = source.split('\n')
module = ast.parse(source, filename)
env = {'__name__': '__main__'}

prevline = 0
endfinder = EndLineFinder()

for stmt in module.body:
    # note: end_lineno will be 1-indexed (but it's always used as an endpoint, so no off-by-one errors here)
    endfinder.visit(stmt)
    end_lineno = endfinder.max_lineno
    for line in range(prevline, end_lineno):
        print(lines[line], file=real_stdout)
    prevline = end_lineno
    # run a one-line "module" containing only this statement
    exec(compile(ast.Module([stmt]), filename, 'exec'), env)
    # flush any incomplete output (FakeStdout is "line-buffered")
    sys.stdout.flush()

Here's a test script:

print(3); print(4)
print(5)

if 1:
    print(6)

x = 3
for i in range(6):
    print(x + i)

import sys
sys.stdout.write('I love Python')

import pprint
pprint.pprint({'a': 'b', 'c': 'd'}, width=5)

and the result:

print(3); print(4)
# 3
# 4
print(5)
# 5

if 1:
    print(6)
# 6

x = 3
for i in range(6):
    print(x + i)
# 3
# 4
# 5
# 6
# 7
# 8

import sys
sys.stdout.write('I love Python')
# I love Python

import pprint
pprint.pprint({'a': 'b', 'c': 'd'}, width=5)
# {'a': 'b',
#  'c': 'd'}
like image 78
nneonneo Avatar answered Oct 06 '22 16:10

nneonneo


You can call the script and inspect its output. You'll have to make some assumptions though. The output is only from stdout, only from lines containing the string "print", and each print produces only one line of output. That being the case, an example command to run it:

> python writer.py script.py

And the script would look like this:

from sys import argv
from subprocess import run, PIPE

script = argv[1]
r = run('python ' + script, stdout=PIPE)
out = r.stdout.decode().split('\n')

with open(script, 'r') as f:
    lines = f.readlines()

with open('out.txt', 'w') as f:
    i = 0
    for line in lines:
        f.write(line)
        if 'print' in line:
            f.write('>>> ' + out[i])
            i += 1

And the output:

import random
from statistics import median, mean

d = [random.random()**2 for _ in range(1000)]
d_mean = round(mean(d), 2)
print(f'mean: {d_mean}')
>>> mean: 0.33
d_median = round(median(d), 2)
print(f'median: {d_median}')
>>> median: 0.24

More complex cases with multiline output or other statements producing output, this won't work. I guess it would require some in depth introspection.

like image 30
Felix Avatar answered Oct 06 '22 16:10

Felix