Repeatedly write to stdin and read from stdout of a process from python

Tags:

python

I have a piece of Fortran code that reads some numbers from STDIN and writes results to STDOUT. For example:

do
  read (*,*) x
  y = x*x
  write (*,*) y
enddo

So I can start the program from a shell and get the following sequence of inputs/outputs:

5.0
25.0
2.5
6.25

Now I need to do this from within Python. After futilely wrestling with subprocess.Popen and looking through old questions on this site, I decided to use pexpect.spawn:

import pexpect, os
p = pexpect.spawn('squarer')
p.setecho(False)
p.write("2.5" + os.linesep)
res = p.readline()

and it works. The problem is, the real data I need to pass between Python and my Fortran program is an array of 100,000 (or more) double-precision floats. If they're contained in an array called x, then

p.write(' '.join(["%.10f"%k for k in x]) + os.linesep)

times out with the following error message from pexpect:

buffer (last 100 chars):   
before (last 100 chars):   
after: <class 'pexpect.TIMEOUT'>  
match: None  
match_index: None  
exitstatus: None
flag_eof: False
pid: 8574
child_fd: 3
closed: False
timeout: 30
delimiter: <class 'pexpect.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1

unless x has fewer than 303 elements. Is there a way to pass large amounts of data to/from the STDIN/STDOUT of another program?

I have tried splitting the data into smaller chunks, but then I lose a lot in speed.
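For what it's worth, the timeout is consistent with pexpect allocating a pseudo-terminal: in canonical mode a pty accepts only a limited line length (typically around 4096 bytes on Linux, which matches roughly 303 numbers at ~13 characters each), whereas a plain pipe has no such cap. A minimal sketch of the pipe approach, using a hypothetical Python stand-in for `squarer` so it is self-contained:

```python
import subprocess, sys

# Hypothetical stand-in for the Fortran squarer, written in Python so the
# sketch is self-contained: read lines of floats, write back their squares.
squarer = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    vals = [float(v) for v in line.split()]\n"
    "    print(' '.join('%.10f' % (v * v) for v in vals))\n"
    "    sys.stdout.flush()\n"
)

p = subprocess.Popen([sys.executable, "-c", squarer],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

x = [k * 0.5 for k in range(1000)]           # well past the ~303-element limit
p.stdin.write(' '.join("%.10f" % k for k in x) + "\n")
p.stdin.flush()                              # a pipe, unlike a pty, has no line-length cap
y = [float(v) for v in p.stdout.readline().split()]
p.stdin.close()
p.wait()
```

With a plain pipe the single large write goes through, which is essentially what the accepted answer below does with subprocess.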

Thanks in advance.

asked Aug 17 '10 by TM5
1 Answer

Found a solution using the subprocess module, so I'm posting it here for reference if anyone needs to do the same thing.

import subprocess as sbp
import numpy as np

class ExternalProg:

    def __init__(self, arg_list):
        # text=True gives str-based pipes (Python 3); the command is run
        # directly -- shell=True combined with an argument list is a bug
        self.opt = sbp.Popen(arg_list, stdin=sbp.PIPE, stdout=sbp.PIPE,
                             text=True, close_fds=True)

    def toString(self, x):
        return ' '.join("%.12f" % k for k in x)

    def toFloat(self, x):
        # np.float64() cannot parse a list of strings; go through np.array
        return np.array(x.strip().split(), dtype=np.float64)

    def sendString(self, string):
        if not string.endswith('\n'):
            string = string + '\n'
        self.opt.stdin.write(string)
        self.opt.stdin.flush()  # make sure the child actually sees the data

    def sendArray(self, x):
        self.sendString(self.toString(x))

    def readInt(self):
        return int(self.opt.stdout.readline().strip())

    def sendScalar(self, x):
        if isinstance(x, int):
            self.sendString("%i" % x)
        elif isinstance(x, float):
            self.sendString("%.12f" % x)

    def readArray(self):
        return self.toFloat(self.opt.stdout.readline())

    def close(self):
        self.opt.kill()

The class is invoked with an external program called 'optimizer' as:

optim = ExternalProg(['./optimizer'])
optim.sendScalar(500) # send the optimizer the length of the state vector, for example
optim.sendArray(init_x) # the initial guess for x
optim.sendArray(init_g) # the initial gradient g
next_x = optim.readArray() # get the next estimate of x
next_g = evaluateGradient(next_x) # calculate gradient at next_x from within python
# repeat until convergence

On the Fortran side (the program compiled to give the executable 'optimizer'), a 500-element vector would be read in like so:

read(*,*) input_vector(1:500)

and would be written out like so:

write(*,'(500f18.11)') output_vector(1:500)

and that's it! I've tested it with state vectors up to 200,000 elements (which is the upper limit of what I need right now). Hope this helps someone other than myself. This solution works with ifort and xlf90, but not with gfortran for some reason I don't understand.
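For readers without ifort or xlf90 at hand, the same send-scalar / send-array / read-array protocol can be exercised end to end with a hypothetical Python stand-in for 'optimizer' (the doubling child below is purely illustrative, not the real optimizer):

```python
import subprocess, sys

# Hypothetical child mimicking the optimizer's I/O protocol:
# read n, read an n-element vector, write the vector back doubled.
child = (
    "import sys\n"
    "n = int(sys.stdin.readline())\n"
    "x = [float(v) for v in sys.stdin.readline().split()]\n"
    "print(' '.join('%.11f' % (2.0 * v) for v in x))\n"
    "sys.stdout.flush()\n"
)

p = subprocess.Popen([sys.executable, "-c", child],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

n = 200000                                   # the scale the answer reports testing
p.stdin.write("%i\n" % n)
p.stdin.write(' '.join("%.12f" % (k * 1e-3) for k in range(n)) + "\n")
p.stdin.flush()                              # without a flush, both sides can deadlock
result = [float(v) for v in p.stdout.readline().split()]
p.wait()
```

The explicit flush after each write is the detail that most often bites here; compiler-dependent stdout buffering in the child (a plausible explanation for the gfortran failure mentioned above, though unconfirmed) is the mirror image of the same problem.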

answered Oct 06 '22 by TM5