I was wondering how I could implement this in Perl:
while ( not the end of the files )
$var1 = read a line from file 1
$var2 = read a line from file 2
# operate on variables
end while
I'm not sure how to read one line at a time from two files in one while
loop.
Seems like you wrote your answer yourself, almost. Just check for eof
for both file handles, like so:
while (not eof $fh1 and not eof $fh2) {
    my $var1 = <$fh1>;
    my $var2 = <$fh2>;
    # do stuff
}
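For a quick way to try this without creating files, here is a minimal sketch of the same loop. The in-memory filehandles and the sample strings $text1 and $text2 are assumptions made for illustration; with real files you would open paths instead.

```perl
#!/usr/bin/env perl
use strict; use warnings;

# Hypothetical sample data standing in for the contents of two files.
my $text1 = "a1\na2\na3\n";
my $text2 = "b1\nb2\n";

# Opening a reference to a scalar gives an in-memory filehandle.
open my $fh1, '<', \$text1 or die "Cannot open first buffer: $!";
open my $fh2, '<', \$text2 or die "Cannot open second buffer: $!";

my @pairs;
while (not eof $fh1 and not eof $fh2) {
    chomp(my $var1 = <$fh1>);
    chomp(my $var2 = <$fh2>);
    push @pairs, "$var1 $var2";   # operate on the two lines
}

print "$_\n" for @pairs;
```

The loop stops after two passes because the second buffer only has two lines; the third line of the first buffer is never read.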
Note: I expanded my answer in response to @zostay and @jm666's comments.
An efficient, clear, and concise answer to this question starts with the idea that related variables go in an aggregate. So, the array @fh will contain the filehandles from which we are reading simultaneously.
Then, we can read a line from each filehandle and store them in an array using the <> operator in conjunction with map. map takes a transformation rule and a list, and returns another list. Hence:
my @lines = map scalar <$_>, @fh;
takes the filehandles in @fh, reads a single line from each (note scalar), and puts those lines in @lines. This is a one-to-one transformation of @fh.
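To see why scalar matters in that expression, here is a small sketch that reads from the same handles with and without it. The in-memory handles and the sample text are assumptions made for illustration.

```perl
#!/usr/bin/env perl
use strict; use warnings;

# Hypothetical sample data: two three-line "files" held in memory.
my @text = ("one\ntwo\nthree\n", "uno\ndos\ntres\n");

my @fh;
for my $text (@text) {
    open my $h, '<', \$text or die "Cannot open in-memory handle: $!";
    push @fh, $h;
}

# scalar forces scalar context, so each <$_> reads exactly one line.
my @lines = map scalar <$_>, @fh;

# Without scalar, map supplies list context and each <$_> reads
# every remaining line of its handle.
my @rest = map <$_>, @fh;

print "with scalar: ", scalar(@lines), " lines\n";
print "without scalar: ", scalar(@rest), " lines\n";
```

The first map returns one line per handle, while the second drains both handles in a single pass.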
As the documentation for <> indicates, <> returns an undefined value if the end-of-file is reached, or there is an error.
Now, one way to check if we successfully read from all files is to check if the number of defined lines is the same as the number of filehandles. grep selects elements of a list that satisfy a certain criterion. Hence
@fh == grep defined, my @lines = map scalar <$_>, @fh;
would check if the number of filehandles in @fh is the same as the number of defined elements in @lines. However, the @fh appearing on both sides of this comparison can indeed be confusing, so an alternative way of checking that there are no undefined elements in @lines is:
0 == grep !defined, my @lines = map scalar <$_>, @fh;
If you want to put that condition in a while loop, you have to write:
while (0 == grep !defined, my @lines = map scalar <$_>, @fh) {
whereas if you go with an until, you can simply write:
until (grep !defined, my @lines = map scalar <$_>, @fh) {
This means "until at least one of the readlines returns an undefined value, execute the body of the loop".
Now, note that Perl's eof is different from C's eof. The documentation for Perl's eof notes that:
Practical hint: you almost never need to use eof in Perl, because the input operators typically return undef when they run out of data or encounter an error.
If you check eof every time through the loop, you're making an extra read attempt on each handle on each iteration because "this function actually reads a character and then ungetcs it."
I almost always give a self-contained runnable example with my code. Below, I did not want to rely on any specific files existing on your system, so I use the always available DATA and STDIN handles. As opposed to using the eof function, when you use this method, you don't have to worry about where you're reading from: all you care about is whether a readline on any one of the files returned an undefined value. It can also be used with any number of filehandles. Also, you really don't have to put the filehandles in an array, but as I said, related variables belong in an aggregate, so if you find yourself typing stuff like
my $var1 = <$fh1>;
my $var2 = <$fh2>;
realize that you should have used an array to store the filehandles.
#!/usr/bin/env perl
use strict; use warnings;
my @fh = (\*DATA, \*STDIN);
until (grep !defined, my @lines = map scalar <$_>, @fh) {
    print for @lines;
}
__DATA__
one
two
three
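This example script will stop asking for your input on STDIN when the lines in DATA are exhausted. If you do not have any trailing blank lines in the script, you will have to enter four lines before the script terminates: the map reads from every handle before the check runs, so a fourth line is read from STDIN on the pass where DATA returns undef.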
Now, if you want to know which filehandles reached the end, you'd switch to using something like:
#!/usr/bin/env perl
use strict; use warnings;
my @fh = (\*DATA, \*STDIN);
while (1) {
    my @lines = map scalar <$_>, @fh;
    if (my @eof = grep !defined($lines[$_]), 0 .. $#fh) {
        warn "Could not read from filehandle(s) '@eof'";
        last;
    }
    print for @lines;
}
__DATA__
one
two
three
The loops above are designed to stop when any one of the files is exhausted. On the other hand, you might want the loops to run until all of the files are exhausted. In that case, you'd use:
while (grep defined, my @lines = map scalar <$_>, @fh) {
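When the loop keeps running until every file is exhausted, @lines will contain undef for any handle that has already run out, so the body needs to allow for that. A minimal sketch, assuming in-memory handles with made-up contents and an arbitrary choice to substitute an empty string for a missing line:

```perl
#!/usr/bin/env perl
use strict; use warnings;

# Hypothetical sample data of unequal length.
my @text = ("a\nb\nc\n", "x\n");

my @fh;
for my $text (@text) {
    open my $h, '<', \$text or die "Cannot open in-memory handle: $!";
    push @fh, $h;
}

my @rows;
while (grep defined, my @lines = map scalar <$_>, @fh) {
    # Exhausted handles yield undef; replace those with an empty string.
    my @fields = map { defined($_) ? $_ : '' } @lines;
    chomp @fields;
    push @rows, join '|', @fields;
}

print "$_\n" for @rows;
```

The loop runs three times, once for each line of the longer buffer, and ends on the first pass where every handle returns undef.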
Another easy solution without explicit eof() checking would go like this:
while (defined(my $var1 = <$fh1>) and defined(my $var2 = <$fh2>)) {
    # do stuff
}
This uses the fact that <> returns a defined value for every line it reads, and undef only at the end of the file (or on a read error).
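The explicit defined matters here. Perl only adds an implicit defined test when a readline assignment is the entire while condition, not when two of them are joined with and, so without it a final line of "0" with no trailing newline would end the loop early. A minimal sketch, assuming in-memory handles and made-up data:

```perl
#!/usr/bin/env perl
use strict; use warnings;

# Hypothetical sample data; the first buffer ends in "0" with no newline.
my $text1 = "1\n0";
my $text2 = "a\nb\n";

open my $fh1, '<', \$text1 or die "Cannot open first buffer: $!";
open my $fh2, '<', \$text2 or die "Cannot open second buffer: $!";

my @pairs;
while (defined(my $var1 = <$fh1>) and defined(my $var2 = <$fh2>)) {
    chomp($var1, $var2);
    push @pairs, "$var1 $var2";   # the line "0" is still processed
}

print "$_\n" for @pairs;
```

Note that when the second handle runs out first, one extra line has already been read from the first handle by the time the loop stops.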