
Printing to stdout in IPython parallel processes

I'm new to IPython and would like to print intermediate results to stdout while running IPython parallel cluster functions. (I'm aware that with multiple processes, this might mangle the output, but that's fine--it's just for testing/debugging, and the processes I'd be running are long enough that such a collision is unlikely.) I checked the documentation for IPython but can't find an example where the parallelized function prints. Basically, I'm looking for a way to redirect the print output of the subprocesses to the main stdout, the IPython equivalent of

subprocess.Popen( ... , stdout=...)

Printing inside the process doesn't work:

rc = Client()
dview = rc()
def ff(x):
    print(x)
    return x**2
sync = dview.map_sync(ff,[1,2,3,4])
print('sync res=%s'%repr(sync))
async = dview.map_async(ff,[1,2,3,4])
print('async res=%s'%repr(async))
print(async.display_outputs())

returns

sync res=[1, 4, 9, 16]
async res=[1, 4, 9, 16]

So the computation executes correctly, but the print statement in the function ff is never printed, not even when all the processes have returned. What am I doing wrong? How do I get "print" to work?

user1470788 asked Mar 08 '13 07:03


1 Answer

It's actually more similar to subprocess.Popen( ... , stdout=PIPE) than you seem to be expecting. Just as the Popen object has a stdout attribute that you can read to see the stdout of the subprocess, an AsyncResult has a stdout attribute that contains the stdout captured from the engines. The difference is that AsyncResult.stdout is a list of strings, where each item in the list is the stdout of a single engine.
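For comparison, here is a minimal sketch of the subprocess side of that analogy. It only uses the standard library; the child program (`child_code`) is made up for illustration. The point is that with stdout=PIPE nothing reaches your terminal by itself: the output is captured and you read it after the child has finished.

```python
import subprocess
import sys

# Hypothetical child program: prints two lines to its own stdout.
child_code = "print(1); print(2)"

# stdout=PIPE captures the child's output instead of sending it to our terminal.
proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    universal_newlines=True,  # return str instead of bytes
)

# communicate() waits for the child to exit and returns (stdout, stderr).
captured, _ = proc.communicate()

# Nothing has been shown so far; the text appears only because we print it.
print(repr(captured))
```

An AsyncResult behaves the same way, except that it holds one captured string per engine rather than a single stream.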

So, to start out:

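from IPython import parallel  # IPython 4.0 and later: import ipyparallel as parallel

rc = parallel.Client()
dview = rc[:]  # a view on all engines
def ff(x):
    print(x)
    return x**2
sync = dview.map_sync(ff,[1,2,3,4])
print('sync res=%r' % sync)
# note: `async` is a reserved word in Python 3.7+; use another name (e.g. `ar`) there
async = dview.map_async(ff,[1,2,3,4])
print('async res=%r' % async)
async.get()  # wait for the engines to finish, so their stdout has been collected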

gives

sync res=[1, 4, 9, 16]
async res=<AsyncMapResult: ff>

The AsyncResult.stdout list holds one string per engine:

print(async.stdout)
['1\n2\n', '3\n4\n']
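Because stdout is a plain list of strings, you can walk it like any other list. Here is a small sketch that uses the list printed above as literal sample data, so it runs without a cluster. The helper name `label_engine_output` is made up for illustration, and the number it prints is just the position in the list, which I'm assuming (not guaranteeing) lines up with the engine.

```python
def label_engine_output(stdouts):
    """Prefix every captured line with the position of its string in the list."""
    labelled = []
    for position, text in enumerate(stdouts):
        # Each item is the whole stdout of one engine; split it into lines.
        for line in text.splitlines():
            labelled.append("[%d] %s" % (position, line))
    return labelled

# Sample data copied from the output above, standing in for a real AsyncResult.stdout.
sample = ['1\n2\n', '3\n4\n']
for line in label_engine_output(sample):
    print(line)
```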

display_outputs() prints the same captured output, labelled by engine:

print('async output:')
async.display_outputs()

which prints:

async output:
[stdout:0] 
1
2
[stdout:1] 
3
4

And here is a notebook with all of this demonstrated.

Some things to note, based on your question:

  1. You have to wait for the AsyncResult to finish before its outputs are ready (async.get()).
  2. display_outputs() does not return anything; it does the printing/displaying itself, so print(async.display_outputs()) doesn't make sense.
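
Putting both notes together, a sketch of the corrected flow could look like the following. It assumes the ipyparallel package (where IPython.parallel lives as of IPython 4.0) and a cluster that is already running, for example one started with `ipcluster start`. The wrapper name `run_with_output` is made up, and the result variable is called `ar` because `async` is a reserved word in Python 3.7 and later.

```python
def run_with_output(func, values):
    """Map func over values on all engines; return (results, per-engine stdout)."""
    # Assumption: ipyparallel is installed. Imported here so that merely
    # defining this function does not require it.
    import ipyparallel as ipp

    rc = ipp.Client()      # assumption: a cluster is already running
    dview = rc[:]          # a view on all engines
    ar = dview.map_async(func, values)
    results = ar.get()     # note 1: wait for completion before reading outputs
    ar.display_outputs()   # note 2: called for its side effect, not for a return value
    return results, ar.stdout

def ff(x):
    print(x)
    return x**2
```

With a cluster running, `results, stdouts = run_with_output(ff, [1, 2, 3, 4])` should show the engines' printed lines and hand back the squared values along with the raw stdout list.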

minrk answered Nov 07 '22 22:11