While trying to do this:
my $obj = new JavaScript::Minifier;
$obj->minify(*STDIN, *STDOUT);
I modified the line above to:
$obj->minify(*IP_HANDLE,*OP_HANDLE)
The above works if IP_HANDLE and OP_HANDLE are filehandles, but I still cannot figure out what the * actually does when applied to a filehandle or any other data type.
Thanks,
In the bad old days before perl v5.6, which introduced lexical filehandles — more than a decade ago now — passing file- and directory handles was awkward. The code from your question is written using this old-fashioned style.
The technical name for *STDIN, for example, is a typeglob, explained in the “Typeglobs and Filehandles” section of perldata. You may encounter manipulation of typeglobs for various purposes in legacy code. Note that you may grab typeglobs of global variables only, never lexicals.
Passing handles was a common purpose for dealing directly with typeglobs, but there were other uses as well. See below for details.
The perldata documentation explains:
Typeglobs and Filehandles
Perl uses an internal type called a typeglob to hold an entire symbol table entry. The type prefix of a typeglob is a * because it represents all types. This used to be the preferred way to pass arrays and hashes by reference into a function, but now that we have real references, this is seldom needed. [...]
Another use for typeglobs is to pass filehandles into a function or to create new filehandles. If you need to use a typeglob to save away a filehandle, do it this way:
$fh = *STDOUT;
or perhaps as a real reference, like this:
$fh = \*STDOUT;
See perlsub for examples of using these as indirect filehandles in functions.
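As a rough sketch of how these forms behave when handed to a subroutine, the example below passes a bare typeglob, a typeglob reference, and a lexical filehandle to the same function. The helper name `emit`, the handle name `OUT`, and the in-memory buffers are assumptions made up for this illustration; they are not taken from perldata or perlsub.

```perl
use strict;
use warnings;

# Hypothetical helper: writes one line to whatever handle it receives.
sub emit {
    my ($fh, $text) = @_;
    print {$fh} "$text\n";
}

# In-memory buffers stand in for real files so the result can be inspected.
my ($old_style, $new_style) = ('', '');

# Old style: a global (package) filehandle, passed by glob or glob reference.
open OUT, '>', \$old_style or die "open: $!";
emit(*OUT,  'bare typeglob');
emit(\*OUT, 'typeglob reference');
close OUT or die "close: $!";

# Modern style: a lexical filehandle held in an ordinary scalar.
open my $fh, '>', \$new_style or die "open: $!";
emit($fh, 'lexical filehandle');
close $fh or die "close: $!";

print $old_style, $new_style;
```

All three calls reach `emit` as a plain scalar, which is why the function body does not need to know which style the caller used.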
The referenced section of perlsub is below.
Passing Symbol Table Entries (typeglobs)
WARNING: The mechanism described in this section was originally the only way to simulate pass-by-reference in older versions of Perl. While it still works fine in modern versions, the new reference mechanism is generally easier to work with. See below.
Sometimes you don’t want to pass the value of an array to a subroutine but rather the name of it, so that the subroutine can modify the global copy of it rather than working with a local copy. In Perl you can refer to all objects of a particular name by prefixing the name with a star: *foo. This is often known as a “typeglob,” because the star on the front can be thought of as a wildcard match for all the funny prefix characters on variables and subroutines and such.
When evaluated, the typeglob produces a scalar value that represents all the objects of that name, including any filehandle, format, or subroutine. When assigned to, it causes the name mentioned to refer to whatever * value was assigned to it. [...]
Note that a typeglob can be taken on global variables only, not lexicals. Heed the warning above. Prefer to avoid this obscure technique.
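To make the pass-by-name idea concrete, here is a minimal sketch that contrasts the typeglob technique with a real reference. The names `@queue`, `double_by_glob`, and `double_by_ref` are invented for the demonstration, and the `our @items` declaration is only there to satisfy strict 'vars'.

```perl
use strict;
use warnings;

our @queue = (1, 2, 3);

# Old style: receive a typeglob and temporarily alias a local name to it.
sub double_by_glob {
    local (*items) = @_;   # *items now refers to the caller's glob
    our @items;            # declare the package array for strict 'vars'
    $_ *= 2 for @items;    # modifies the caller's global array in place
}

# Modern style: receive an array reference.
sub double_by_ref {
    my ($aref) = @_;
    $_ *= 2 for @$aref;
}

double_by_glob(*queue);    # @queue becomes (2, 4, 6)
double_by_ref(\@queue);    # @queue becomes (4, 8, 12)
print "@queue\n";
```

The reference version also works on lexical arrays, which the typeglob version cannot do.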
Without the * sigil, a bareword is just a string.
Simple strings sometimes suffice, however. For example, the print operator allows
$ perl -le 'print { "STDOUT" } "Hiya!"'
Hiya!
$ perl -le '$h="STDOUT"; print $h "Hiya!"'
Hiya!
$ perl -le 'print "STDOUT" +123'
123
These fail with strict 'refs' enabled. The manual explains:
FILEHANDLE may be a scalar variable name, in which case the variable contains the name of or a reference to the filehandle, thus introducing one level of indirection.
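The following sketch shows the difference under strict 'refs': a string that merely names a handle is rejected at run time, while a typeglob reference is accepted. The variable names are illustrative only, and `eval` is used so the failure can be captured instead of ending the program.

```perl
use strict;
use warnings;

my $name = "STDOUT";

# A string naming the handle is a symbolic reference, so strict 'refs'
# makes this die at run time; eval traps the error.
my $string_ok = eval { print {$name} "via string\n"; 1 } ? 1 : 0;
my $error     = $string_ok ? '' : $@;

# A typeglob reference is a real reference and is allowed.
my $globref_ok = eval { print {\*STDOUT} "via glob reference\n"; 1 } ? 1 : 0;

print "string name accepted: $string_ok\n";
print "glob reference accepted: $globref_ok\n";
```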
In your example, consider the syntactic ambiguity. Without the * sigil, you could mean strings
$ perl -MO=Deparse,-p prog.pl
use JavaScript::Minifier;
(my $obj = 'JavaScript::Minifier'->new);
$obj->minify('IP_HANDLE', 'OP_HANDLE');
or maybe a sub call
$ perl -MO=Deparse,-p prog.pl
use JavaScript::Minifier;
sub OP_HANDLE {
1;
}
(my $obj = 'JavaScript::Minifier'->new);
$obj->minify('IP_HANDLE', OP_HANDLE());
or, of course, a filehandle. Note in the examples above how the bareword JavaScript::Minifier also compiles as a simple string.
Enable the strict pragma and it all goes out the window anyway:
$ perl -Mstrict prog.pl
Bareword "IP_HANDLE" not allowed while "strict subs" in use at prog.pl line 6.
Bareword "OP_HANDLE" not allowed while "strict subs" in use at prog.pl line 6.
One trick with typeglobs that’s handy for Stack Overflow posts is
*ARGV = *DATA;
(I could be more precise with *ARGV = *DATA{IO}, but that’s a little fussy.)
This allows the diamond operator <> to read from the DATA filehandle, as in
#! /usr/bin/perl
*ARGV = *DATA; # for demo only; remove in production
while (<>) { print }
__DATA__
Hello
there
This way, the program and its input can be in a single file, and the code is a closer match to how it will look in production: just delete the typeglob assignment.
As noted in perlsub
Temporary Values via local()
WARNING: In general, you should be using my instead of local, because it’s faster and safer. Exceptions to this include the global punctuation variables, global filehandles and formats, and direct manipulation of the Perl symbol table itself. local is mostly used when the current value of a variable must be visible to called subroutines. [...]
you can use typeglobs to localize filehandles:
$ cat prog.pl
#! /usr/bin/perl
sub foo {
local(*STDOUT);
open STDOUT, ">", "/dev/null" or die "$0: open: $!";
print "You can't see me!\n";
}
print "Hello\n";
foo;
print "Good bye.\n";
$ ./prog.pl
Hello
Good bye.
“When to Still Use local()” in perlsub has another example.
2. You need to create a local file or directory handle or a local function.
A function that needs a filehandle of its own must use local() on a complete typeglob. This can be used to create new symbol table entries:
sub ioqueue {
    local (*READER, *WRITER); # not my!
    pipe (READER, WRITER) or die "pipe: $!";
    return (*READER, *WRITER);
}
($head, $tail) = ioqueue();
To emphasize, this style is old-fashioned. Prefer to avoid global filehandles in new code, but being able to understand the technique in existing code is useful.
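For comparison, a lexical-filehandle version of the same idea might look like the sketch below. It reuses the names `ioqueue`, `$head`, and `$tail` from the perlsub excerpt, but the rewrite itself is an illustration rather than text from the documentation.

```perl
use strict;
use warnings;

# Each call creates a fresh pair of lexical handles, so no typeglobs
# need to be localized and nothing leaks into the symbol table.
sub ioqueue {
    pipe(my $reader, my $writer) or die "pipe: $!";
    return ($reader, $writer);
}

my ($head, $tail) = ioqueue();

# A short message fits in the pipe buffer, so writing before reading is
# safe here; larger transfers would need a separate reader process.
print {$tail} "through the pipe\n";
close $tail or die "close: $!";

my $line = <$head>;
print $line;
```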
*foo{THING} syntax
You can get at the different parts of a typeglob, as perlref explains:
A reference can be created by using a special syntax, lovingly known as the *foo{THING} syntax. *foo{THING} returns a reference to the THING slot in *foo (which is the symbol table entry which holds everything known as foo).
$scalarref = *foo{SCALAR};
$arrayref = *ARGV{ARRAY};
$hashref = *ENV{HASH};
$coderef = *handler{CODE};
$ioref = *STDIN{IO};
$globref = *foo{GLOB};
$formatref = *foo{FORMAT};
All of these are self-explanatory except for *foo{IO}. It returns the IO handle, used for file handles (open), sockets (socket and socketpair), and directory handles (opendir). For compatibility with previous versions of Perl, *foo{FILEHANDLE} is a synonym for *foo{IO}, though it is deprecated as of 5.8.0. If deprecation warnings are in effect, it will warn of its use.
*foo{THING} returns undef if that particular THING hasn’t been used yet, except in the case of scalars. *foo{SCALAR} returns a reference to an anonymous scalar if $foo hasn’t been used yet. This might change in a future release.
*foo{IO} is an alternative to the *HANDLE mechanism given in [“Typeglobs and Filehandles” in perldata] for passing filehandles into or out of subroutines, or storing into larger data structures. Its disadvantage is that it won’t create a new filehandle for you. Its advantage is that you have less risk of clobbering more than you want to with a typeglob assignment. (It still conflates file and directory handles, though.) However, if you assign the incoming value to a scalar instead of a typeglob as we do in the examples below, there’s no risk of that happening.
splutter(*STDOUT); # pass the whole glob
splutter(*STDOUT{IO}); # pass both file and dir handles
sub splutter {
    my $fh = shift;
    print $fh "her um well a hmmm\n";
}
$rec = get_rec(*STDIN); # pass the whole glob
$rec = get_rec(*STDIN{IO}); # pass both file and dir handles
sub get_rec {
    my $fh = shift;
    return scalar <$fh>;
}
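A small sketch of poking at individual slots follows. The package variable `@foo` and the sub `foo` are made up for the demonstration; `%foo` is deliberately never mentioned so that its slot stays empty.

```perl
use strict;
use warnings;

our @foo = (1, 2, 3);
sub foo { return "called foo" }

my $arrayref = *foo{ARRAY};   # same array as \@foo
my $coderef  = *foo{CODE};    # same sub as \&foo
my $hashref  = *foo{HASH};    # undef, because %foo has never been used

print "array slot: @$arrayref\n";
print "code slot: ", $coderef->(), "\n";
print "hash slot: ", (defined $hashref ? "filled" : "empty"), "\n";

# The IO slot can be used directly as a filehandle.
my $io = *STDOUT{IO};
print {$io} "printed through the IO slot\n";
```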
Context is key with Perl. In your example, although the syntax may be ambiguous, the intent is not: even if the parameters are strings, those strings are clearly intended to name filehandles.
So consider all the cases minify may need to handle. For example:
#! /usr/bin/perl
use warnings;
use strict;
*IP_HANDLE = *DATA;
open OP_HANDLE, ">&STDOUT";
open my $fh, ">&STDOUT";
my $offset = tell DATA;
use JavaScript::Minifier;
my $obj = JavaScript::Minifier->new;
$obj->minify(*IP_HANDLE, "OP_HANDLE");
seek DATA, $offset, 0 or die "$0: seek: $!";
$obj->minify(\*IP_HANDLE, $fh);
__DATA__
Ahoy there
matey!
As a library author, being accommodating can be useful. To illustrate, the following stub of JavaScript::Minifier understands both old-fashioned and modern ways of passing filehandles.
package JavaScript::Minifier;
use warnings;
use strict;
sub new { bless {} => shift }
sub minify {
my($self,$in,$out) = @_;
for ($in, $out) {
no strict 'refs';
next if ref($_) || ref(\$_) eq "GLOB";
my $pkg = caller;
$_ = *{ $pkg . "::" . $_ }{IO};
}
while (<$in>) { print $out $_ }
}
1;
Output:
$ ./prog.pl
Name "main::OP_HANDLE" used only once: possible typo at ./prog.pl line 7.
Ahoy there
matey!
Ahoy there
matey!
The * refers to a Perl "typeglob", which is an obscure implementation detail of Perl. Some older Perl code needs to refer to file handles using typeglobs (since there wasn't any other way to do it at the time). More modern code can use filehandle references instead, which are easier to work with.
The * is analogous to $ or %: it refers to a different kind of object known by the same name.
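To illustrate how one name covers several kinds of object, the sketch below defines a scalar, an array, a hash, and a sub that all share the name `pet`, then aliases the whole glob to a second name. Both names are invented for this example.

```perl
use strict;
use warnings;

# Four different things that share one name, and therefore one typeglob.
our $pet = "scalar";
our @pet = ("array");
our %pet = (kind => "hash");
sub pet { return "code" }

# Assigning one glob to another aliases every slot at once.
*animal = *pet;
our ($animal, @animal, %animal);   # declarations for strict 'vars'

my $summary = join " ", $animal, $animal[0], $animal{kind}, animal();
print "$summary\n";

# The alias is two-way: writing through one name changes the other.
$animal = "changed";
print "$pet\n";
```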
From the perldata documentation page:
Perl uses an internal type called a typeglob to hold an entire symbol table entry. The type prefix of a typeglob is a *, because it represents all types. This used to be the preferred way to pass arrays and hashes by reference into a function, but now that we have real references, this is seldom needed.