I am confused about the Perl grammar. This is an example about Web::Scraper.
my $t = scraper {
process "li", "list[]" => "TEXT";
};
print ref($t), "\n";
Output:
Web::Scraper
I can't understand the meaning of the curly braces. If scraper is a function, then why use {} instead of ()?
scraper is a subroutine defined with the (&) prototype:
sub scraper (&) {
my $code_ref = shift;
...
$code_ref->($some_value)
...
}
This prototype tells perl to parse the user subroutine scraper in a similar way to the builtin map {...} @list and grep {...} @list constructs. The code enclosed in the {...} braces is passed into the function as a code reference, just as if you had written sub {...}. To write a map- or grep-like subroutine, you would use the (&@) prototype, which tells perl to expect a code block and then a list.
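As a minimal sketch of the two prototypes just described, the following defines one subroutine with (&) and one with (&@). The names run_block and apply_to_each are hypothetical and chosen only for illustration; they are not part of Web::Scraper.

```perl
use strict;
use warnings;

# Hypothetical helper: takes exactly one block, like scraper's (&) prototype.
sub run_block (&) {
    my $code_ref = shift;       # the bare block arrives as a code reference
    return $code_ref->(21);     # call it with an argument of our choosing
}

# Hypothetical map-like helper: a block followed by a list, via (&@).
sub apply_to_each (&@) {
    my ($code_ref, @items) = @_;
    my @results;
    for my $item (@items) {
        local $_ = $item;       # expose each item as $_, as map does
        push @results, $code_ref->();
    }
    return @results;
}

my $doubled = run_block { $_[0] * 2 };
my @squares = apply_to_each { $_ * $_ } 1 .. 4;

print "$doubled\n";     # 42
print "@squares\n";     # 1 4 9 16
```

Note that the subroutines are declared before they are called; perl only applies a prototype to calls it compiles after seeing the declaration.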
In this case, the (&) prototype states that the function takes exactly one argument, with high precedence, and that the argument must be a bare block (interpreted as a coderef), a literal sub {...} declaration, or an expression preceded with a reference/dereference pair \&{some_expression}. If the expression is a simple scalar, you can write \&$code_ref.
While the syntax you found, scraper {...}, is the shortest, you can also call it as scraper sub {...}.
If the value you need to pass to scraper is held in a variable, you can write:
scraper \&$code_ref; # where the \& portion asserts that the value is a coderef
The same syntax is used with named subroutines:
sub some_sub {...}
scraper \&some_sub;
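The calling forms above can be tried side by side with a small stand-in. This is only a sketch: call_it is a hypothetical subroutine that takes the place of scraper and simply runs whatever code it receives.

```perl
use strict;
use warnings;

# Hypothetical stand-in for scraper: it just runs the code it is given.
sub call_it (&) {
    my $code_ref = shift;
    return $code_ref->();
}

sub named_sub { return "named" }
my $code_ref = sub { return "variable" };

my $from_block  = call_it { "block" };          # bare block
my $from_sub    = call_it sub { "anonymous" };  # explicit sub {...}
my $from_named  = call_it \&named_sub;          # reference to a named subroutine
my $from_scalar = call_it \&$code_ref;          # coderef held in a scalar

print join(", ", $from_block, $from_sub, $from_named, $from_scalar), "\n";
# block, anonymous, named, variable
```

Writing call_it $code_ref without the leading \& is rejected at compile time, because the prototype only accepts the forms listed above.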
You can learn more about Perl's subroutine options at perlsub.
Lastly, the requisite warning about prototypes. Prototypes are often misused by novice Perl programmers as a form of argument validation, similar to function signatures in some other languages. This usage is error prone due to the imposition of context (scalar vs list) that prototypes can specify. Argument checking is best done inside the subroutine while unpacking @_ if it is needed at all. Usage of prototypes should be reserved for cases such as scraper where the intent is to create a subroutine that is parsed in a similar way to one of perl's builtin functions.
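To make the context problem concrete, here is a small sketch with two hypothetical subroutines, add_pair and add_pair_checked. The first uses a ($$) prototype as if it were validation; the second checks its arguments inside the body instead.

```perl
use strict;
use warnings;

# Hypothetical example: the ($$) prototype looks like "requires two arguments",
# but it also forces scalar context on each argument.
sub add_pair ($$) {
    my ($x, $y) = @_;
    return $x + $y;
}

# Same job, with the check done while unpacking @_.
sub add_pair_checked {
    die "add_pair_checked expects exactly two arguments\n" unless @_ == 2;
    my ($x, $y) = @_;
    return $x + $y;
}

my @values = (10, 20, 30);

# @values is evaluated in scalar context, so its element count (3) is passed,
# not its contents. No error or warning is raised.
my $sum = add_pair(@values, 5);
print "$sum\n";    # 8

# Without a prototype the array flattens to four arguments and the check fires.
my $ok = eval { add_pair_checked(@values, 5); 1 };
print $ok ? "accepted\n" : "rejected: $@";
```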
Web::Scraper::scraper is a function that takes another function (or function reference) as an argument. In this context, the { ... } braces declare an anonymous subroutine that is passed as an argument to the function (in other contexts, { ... } can declare a hash reference). Presumably, the function will cause the code you supply to be executed at some point (as the documentation suggests, when the scrape function is called).
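The idea of storing a block now and running it later can be sketched as follows. This is not Web::Scraper's actual implementation; deferred, My::Deferred and run are hypothetical names used only to show why ref() on the result prints a class name rather than CODE.

```perl
use strict;
use warnings;

# Hypothetical constructor: wraps the block in an object instead of running it.
sub deferred (&) {
    my $code_ref = shift;
    return bless { code => $code_ref }, 'My::Deferred';
}

# Hypothetical method: runs the stored block, passing along any arguments.
sub My::Deferred::run {
    my ($self, @args) = @_;
    return $self->{code}->(@args);
}

my $calls = 0;
my $task  = deferred { $calls++; return "ran with $_[0]" };

print ref($task), "\n";                 # My::Deferred
print "calls before run: $calls\n";     # calls before run: 0

my $result = $task->run("input");
print "$result\n";                      # ran with input
print "calls after run: $calls\n";      # calls after run: 1
```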
There are alternative ways of calling a function like this that you may have seen before:
# reference to a named function
sub my_scraper_function { process "li", "list[]" => "TEXT" };
scraper \&my_scraper_function;
# reference to an anonymous function
my $scraper_function = sub { process "li", "list[]" => "TEXT" };
scraper \&$scraper_function; # the (&) prototype requires \& in front of a variable
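# using a function name (a symbolic reference)
# the (&) prototype rejects a plain string, so the call must bypass it with &scraper(...);
# the name can then only be called where 'use strict "refs"' is not in effect
sub my_scraper_function { process "li", "list[]" => "TEXT" };
&scraper('ThisPackage::my_scraper_function');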
You will also see this syntax in some of Perl's built-in and other common functions, like map and grep:
@square_roots = map { sqrt($_) } 1 .. 100;