I went through the documentation of open3 and here is the portion that I could not understand:
If you try to read from the child's stdout writer and their stderr writer, you'll have problems with blocking, which means you'll want to use select() or the IO::Select, which means you'd best use sysread() instead of readline() for normal stuff.
This is very dangerous, as you may block forever. It assumes it's going to talk to something like bc, both writing to it and reading from it. This is presumably safe because you "know" that commands like bc will read a line at a time and output a line at a time. Programs like sort that read their entire input stream first, however, are quite apt to cause deadlock.
So I tried out open3, hoping to understand it better. Here is the first attempt:
sub hung_execute {
    my($cmd) = @_;
    print "[COMMAND]: $cmd\n";
    my $pid = open3(my $in, my $out, my $err = gensym(), $cmd);
    print "[PID]: $pid\n";
    waitpid($pid, 0);
    if(<$err>) {
        print "[ERROR] : $_" while(<$err>);
        die;
    }
    print "[OUTPUT]: $_" while (<$out>);
}
It's interesting to note that I must initialize $err here.
Anyway, this just hangs when I execute("sort $some_file"); given that $some_file is a text file containing more than 4096 chars (the limit on my machine).
I then looked into this FAQ, and below was my new version of execute:
sub good_execute {
    my($cmd) = @_;
    print "[COMMAND]: $cmd\n";
    my $in = gensym();
    #---------------------------------------------------
    # using $in, $out doesn't work. it expects a glob?
    local *OUT = IO::File->new_tmpfile;
    local *ERR = IO::File->new_tmpfile;
    my $pid = open3($in, ">&OUT", ">&ERR", $cmd);
    print "[PID]: $pid\n";
    waitpid($pid, 0);
    seek $_, 0, 0 for \*OUT, \*ERR;
    if(<ERR>) {
        print "[ERROR] : $_" while(<ERR>);
        die;
    }
    print "[OUTPUT]: $_" while (<OUT>);
}
The sort command executes fine now, but I can't figure out why.
[Update] After reading @tchrist's answer, I read up on IO::Select, and after some more googling, have come up with this version of execute:
sub good_execute {
    my($cmd) = @_;
    print "[COMMAND]: $cmd\n";
    my $pid = open3(my $in, my $out, my $err = gensym(), $cmd);
    print "[PID]: $pid\n";
    my $sel = new IO::Select;
    $sel->add($out, $err);
    while(my @fhs = $sel->can_read) {
        foreach my $fh (@fhs) {
            my $line = <$fh>;
            unless(defined $line) {
                $sel->remove($fh);
                next;
            }
            if($fh == $out) {
                print "[OUTPUT]: $line";
            }elsif($fh == $err) {
                print "[ERROR] : $line";
            }else{
                die "[ERROR]: This should never execute!";
            }
        }
    }
    waitpid($pid, 0);
}
This is working fine, and a few things have become clearer now. But the overall picture is still a little hazy.
So my questions are:
1. What is wrong with hung_execute?
2. good_execute works because of the >& in the open3 call. But why and how?
3. good_execute did not work when I used lexical variables (my $out instead of OUT) for filehandles. It gave this error: open3: open(GLOB(0x610920), >&main::OUT) failed: Invalid argument. Why so?

You’ve encountered the very problems I wrote about in the documentation, and then some. You’re deadlocking because you are waiting for the child to exit before you read from it. If it has more than a pipe buffer of output, it will block and never exit. Plus you haven’t closed your ends of the handles.
You have other errors, too. You cannot test for output on a handle that way, because you just did a blocking readline and discarded its results. Furthermore, if you try to read all the stderr before the stdout, and if there is more than a pipe buffer of output on stdout, then your child will block writing to stdout while you block reading from his stderr.
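The first point, that testing a handle with a bare readline throws a line away, can be seen without any child process. The following is a minimal sketch that uses an in-memory filehandle as a stand-in for the child's stderr pipe; the sample data and variable names are made up for illustration.

```perl
use strict;
use warnings;

# In-memory filehandle standing in for the child's stderr pipe (illustrative data).
my $data = "first line\nsecond line\nthird line\n";
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";

my @seen;
if (<$fh>) {                      # reads "first line\n" and discards it
    while (my $line = <$fh>) {    # so this loop starts at the second line
        push @seen, $line;
    }
}

print "[ERROR] : $_" for @seen;   # the first line is never printed
```

On a real pipe the same test is worse than lossy: the readline inside the if blocks until the child writes a full line or closes the handle.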
You really have to use select, or IO::Select, to do this correctly. You must only read from a handle when there is output available on that handle, and you must not mix buffered calls with select, either, unless you are very very lucky.
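As a rough illustration of that advice, here is a sketch that drains both pipes with IO::Select and unbuffered sysread, and only then waits for the child. The helper name run_capture, the 64 KB read size, and the decision to close the child's stdin immediately are assumptions made for this example; it also assumes a Unix-like system, where select works on pipes.

```perl
use strict;
use warnings;
use IPC::Open3 qw(open3);
use IO::Select;
use Symbol qw(gensym);

# Hypothetical helper: runs a command and returns (stdout, stderr, exit code).
sub run_capture {
    my (@cmd) = @_;

    my $pid = open3(my $in, my $out, my $err = gensym(), @cmd);
    close $in;    # assumption: we send no input, so give the child EOF on stdin

    # Map each handle to a buffer name before anything gets closed.
    my %name = (fileno($out) => 'out', fileno($err) => 'err');
    my %buf  = (out => '', err => '');
    my $sel  = IO::Select->new($out, $err);

    while ($sel->count) {
        for my $fh ($sel->can_read) {
            my $key = $name{ fileno $fh };
            # sysread bypasses perl's buffering, so it is safe to pair with select.
            my $n = sysread($fh, $buf{$key}, 64 * 1024, length $buf{$key});
            die "sysread failed: $!" unless defined $n;
            if ($n == 0) {            # EOF: the child closed this stream
                $sel->remove($fh);
                close $fh;
            }
        }
    }

    waitpid($pid, 0);                 # reap only after both pipes are drained
    return ($buf{out}, $buf{err}, $? >> 8);
}

# Usage: the child writes to both streams; neither pipe can fill up and stall it.
my ($stdout, $stderr, $status) =
    run_capture($^X, '-e', 'print "hello\n"; print STDERR "oops\n"');
print "[OUTPUT]: $stdout";
print "[ERROR] : $stderr";
```

Reading whole chunks and splitting into lines afterwards avoids the problem of a buffered readline holding data that select cannot see.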
hung_execute:

Parent                     Child
------------------------   ------------------------
Waits for child to exit
                           Writes to STDOUT
                           Writes to STDOUT
                           ...
                           Writes to STDOUT
                           Tries to write to STDOUT
                             but the pipe is full,
                             so it blocks until the
                             pipe is emptied some.

Deadlock!
good_execute:

Parent                     Child
------------------------   ------------------------
Waits for data
                           Writes to STDOUT
Reads the data
Waits for data
                           Writes to STDOUT
Reads the data
Waits for data
...                        ...
                           Writes to STDOUT
Reads the data
Waits for data
                           Exits, closing STDOUT
Reads EOF
Waits for child to exit
The pipe could get full, blocking the child; but the parent will come around to empty it soon enough, unblocking the child. No deadlock.
">&OUT" evaluates to >&OUT. (No variables to interpolate.)
">&$OUT" evaluates to >&GLOB(0x########). (You interpolated $OUT.)
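The difference is easy to check by printing both strings. This is a small sketch, assuming $OUT holds a glob reference created with gensym; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Symbol qw(gensym);

my $OUT = gensym();              # a lexical holding a reference to an anonymous glob

my $bareword_spec = ">&OUT";     # nothing to interpolate: the literal string >&OUT
my $lexical_spec  = ">&$OUT";    # $OUT is stringified into the string

print "$bareword_spec\n";        # a handle name that open3 can look up
print "$lexical_spec\n";         # >&GLOB(0x...), which names no handle at all
```

open3 only ever sees the resulting string, so in the second case it is asked to dup a handle literally named GLOB(0x...), which does not exist.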
There is a way of passing lexical file handles (or rather their descriptors), but there's a bug concerning them, so I always use package variables with open3.
STDOUT and STDERR are independent (unless you do something like 2>&1, and even then, they'll have separate flags and buffers). You came to the wrong conclusion if you discovered that they're not.