I apologize if this question sounds simple. My intention is to understand in depth how this (these?) particular operator(s) work, and I was unable to find a satisfactory description in the perldocs (it probably exists somewhere, I just couldn't find it for the life of me).
Particularly, I am interested in knowing if
a) <>
b) <*> or whatever glob, and
c) <FH>
are fundamentally similar or different, and how they are used internally.
I built my own testing functions to gain some insight on this (presented below). I still don't have a full understanding (my understanding might even be wrong) but this is what I've concluded:
<>
<list of globs>
<FH> just seems to return undef when assigned to a variable.
Questions: Why is it undef? Does it not have a type? Does this behave similarly when the FH is not a bareword filehandle?
General Question: What is it that handles the value of <> and the others during execution? In scalar context, is any sort of reference returned, or are the variables that we assign them to, at that point, identical to any other non-ref scalar?
I also noticed that even though I am assigning them in sequence, the output is reset each time. i.e. I would have assumed that when I do
$thing_s = <>;
@thing_l = <>;
@thing_l would be missing the first item, since it was already received by $thing_s. Why is this not the case?
Code used for testing:
use strict;
use warnings;
use Switch;
use Data::Dumper;

die "Call with a list of files\n" if (@ARGV<1);
my @whats = ('<>','<* .*>','<FH>');
my $thing_s;
my @thing_l;
for my $what(@whats){
    switch($what){
        case('<>'){
            $thing_s = <>;
            @thing_l = <>;
        }
        case('<* .*>'){
            $thing_s = <* .*>;
            @thing_l = <* .*>;
        }
        case('<FH>'){
            open FH, '<', $ARGV[0];
            $thing_s = <FH>;
            @thing_l = <FH>;
        }
    }
    print "$what in scalar context is: \n".Dumper($thing_s)."\n";
    print "$what in list context is: \n".Dumper(@thing_l)."\n";
}
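The <> thingies are all iterators. All of these variants have common behaviour:
- Used in list context, all remaining elements are returned
- Used in scalar context, only the next element is returned
- Used in scalar context, it returns undef once the iterator is exhausted
These last two properties make it suitable for use as a condition in while loops.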
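The shared behaviour can be sketched with a small, self-contained example. The in-memory handle and the sample data are assumptions made only for illustration; any readable file handle should behave the same way. It also shows that a scalar read does consume an element, so a following list read does not see it again.

```perl
use strict;
use warnings;

# Hypothetical sample data; an in-memory handle keeps the sketch self-contained.
my $data = "one\ntwo\nthree\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my $first = <$fh>;    # scalar context: only the next element
my @rest  = <$fh>;    # list context: all remaining elements
my $done  = <$fh>;    # scalar context on an exhausted iterator: undef
close $fh;

chomp($first, @rest);
print "first: $first\n";
print "rest: @rest\n";
print "done is ", (defined $done ? "defined" : "undef"), "\n";

# Because undef ends the iteration, the operator works as a while condition.
open $fh, '<', \$data or die "Cannot reopen in-memory handle: $!";
my $count = 0;
while (defined(my $line = <$fh>)) {
    $count++;
}
close $fh;
print "lines seen in the loop: $count\n";
```

Here $first holds the first line only, and @rest starts at the second line.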
There are two kinds of iterators that can be used with <>:
- <$fh> is equivalent to readline $fh
- <* .*> is equivalent to glob '* .*'
The <> is parsed as a readline when it contains either nothing, a bareword, or a simple scalar variable. Anything more complex, such as a hash element in <$hash{key}>, is not parsed as a readline, so call readline $hash{key} directly for such expressions.
It is parsed as a glob in all other cases. This can be made explicit by using quotes: <"* .*">, but you should really be explicit and use the glob function instead.
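A minimal sketch of the two spellings side by side. The scratch directory, the file names and the %handles hash are hypothetical, and the sketch assumes the temporary directory path contains no whitespace (glob splits its pattern on whitespace).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directory with two files, so the glob has something to match.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt)) {
    open my $out, '>', "$dir/$name" or die "Cannot write $dir/$name: $!";
    print {$out} "content of $name\n";
    close $out;
}

# A simple scalar inside the brackets is parsed as readline.
open my $fh, '<', "$dir/a.txt" or die "Cannot read $dir/a.txt: $!";
my $via_brackets = <$fh>;
close $fh;

# The same read, spelled out with the readline function.
open $fh, '<', "$dir/a.txt" or die "Cannot read $dir/a.txt: $!";
my $via_readline = readline $fh;
close $fh;

# Calling glob directly instead of relying on how <...> is parsed.
my @via_glob = sort glob("$dir/*.txt");

# A handle stored in a hash element is not a simple scalar, so readline is used.
my %handles;
open $handles{a}, '<', "$dir/a.txt" or die "Cannot read $dir/a.txt: $!";
my $from_hash = readline $handles{a};
close $handles{a};

print "brackets and readline agree\n" if $via_brackets eq $via_readline;
print "glob found ", scalar(@via_glob), " files\n";
```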
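Some details differ, e.g. where the iterator state is kept:
- For file handles, the file handle itself holds the iterator state
- For globs, each glob expression in the code keeps its own state
Another part is if the iterator can restart:
- File handles can only be restarted by seeking, which not all file handles support
- glob restarts after it has returned undef once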
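The difference in restart behaviour can be sketched as follows. The scratch directory, file names and sample data are hypothetical, and the in-memory handle is used only because it supports seek.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directory holding two empty files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a b)) {
    open my $out, '>', "$dir/$name" or die "Cannot write $dir/$name: $!";
    close $out;
}

# One glob expression, iterated in scalar context over two passes.
# After it has returned undef once, it starts over on the next call.
my @seen;
for my $pass (1 .. 2) {
    while (defined(my $file = glob("$dir/*"))) {
        push @seen, $file;
    }
}

# A file handle stays exhausted until it is explicitly rewound.
my $data = "x\ny\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my @first_pass  = <$fh>;    # both lines
my @second_pass = <$fh>;    # nothing left
seek $fh, 0, 0 or die "Cannot seek: $!";
my @third_pass  = <$fh>;    # both lines again after the seek
close $fh;

print "glob results over two passes: ", scalar(@seen), "\n";
print "handle passes: ", join(", ", scalar(@first_pass), scalar(@second_pass), scalar(@third_pass)), "\n";
```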
If no file handle is used in <>, then this defaults to the special ARGV file handle. The behaviour of <ARGV> is as follows:
If @ARGV is empty, then ARGV is STDIN.
Otherwise, the elements of @ARGV are treated as file names. The following pseudocode is executed:
$ARGV = shift @ARGV;
open ARGV, $ARGV or die ...; # careful! no open mode is used
The $ARGV scalar holds the filename, and the ARGV file handle holds that file handle.
When ARGV would be at eof, the next file from @ARGV is opened.
Only when @ARGV is completely empty can <> return undef.
This can actually be used as a trick to read from many files:
local @ARGV = qw(foo.txt bar.txt baz.txt);
while (<>) {
    ...;
}
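A runnable variant of the same trick, as a sketch: the temporary files stand in for foo.txt and friends and are created only so the example is self-contained. It also records $ARGV for each line and the size of @ARGV afterwards, which illustrates that <> shifts the names off @ARGV as it goes.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical input files, created so the sketch does not depend on existing files.
my $dir = tempdir(CLEANUP => 1);
my @files;
for my $name (qw(foo.txt bar.txt)) {
    my $path = "$dir/$name";
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "line from $name\n";
    close $out;
    push @files, $path;
}

my (@lines, @sources, $remaining);
{
    local @ARGV = @files;          # <> takes its file names from here
    while (my $line = <>) {
        chomp $line;
        push @lines, $line;
        push @sources, $ARGV;      # name of the file currently being read
    }
    $remaining = @ARGV;            # the names have been shifted off by now
}

print "$sources[$_]: $lines[$_]\n" for 0 .. $#lines;
print "names left in \@ARGV: $remaining\n";
```

Using local restores the caller's @ARGV when the block ends.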
What is it that handles the value of <> and the others during execution?
The Perl compiler is very context-aware, and often has to choose between multiple ambiguous interpretations of a code segment. It will compile <> as a call to readline or to glob depending on what is inside the brackets.
In scalar context, is any sort of reference returned, or are the variables that we assign them to, at that point identical to any other non-ref scalar?
I'm not sure what you're asking here, or why you think the variables that take the result of a <> should be any different from other variables. They are always simple string values: either a filename returned by glob, or some file data returned by readline.
<FH> just seems to return undef when assigned to a variable. Questions: Why is it undef? Does it not have a type? Does this behave similarly when the FH is not a bareword filehandle?
This form will treat FH as a filehandle, and return the next line of data from the file if it is open and not at eof. Otherwise undef is returned, to indicate that nothing valid could be read. Perl is very flexible with types, but undef behaves as its own type, like Ruby's nil. The operator behaves the same whether FH is a global file handle or a variable that contains a reference to a typeglob.
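As a hedged illustration of the undef case: in the test script, the earlier @thing_l = <> has presumably shifted every name off @ARGV, so $ARGV[0] would be undefined by the time FH is opened and the unchecked open would fail. The sketch below reproduces the effect with a hypothetical nonexistent path instead of an emptied @ARGV, and shows that checking the result of open surfaces the problem.

```perl
use strict;
use warnings;

# Hypothetical path that does not exist, standing in for an undefined $ARGV[0].
my $missing = "/nonexistent/path/for/demo.txt";

# Unchecked open on a bareword handle, as in the test script.
my $ok = open(FH, '<', $missing);

my $line;
{
    no warnings 'io';    # silence the warning about reading from an unopened handle
    $line = <FH>;        # nothing can be read, so this is undef
}

# Checking the return value of open reports the failure instead of hiding it.
my $error;
open(my $fh, '<', $missing) or $error = "Cannot open $missing: $!";

print "open succeeded: ", ($ok ? "yes" : "no"), "\n";
print "line is ", (defined $line ? "defined" : "undef"), "\n";
print "$error\n" if defined $error;
```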