
What is the meaning of {} in the grammar of Web::Scraper?

Tags:

perl

I am confused about a piece of Perl syntax. Here is an example that uses Web::Scraper.

my $tweets = scraper {
    process "li", "list[]" => "TEXT";
};
print ref($tweets), "\n";

Output:

Web::Scraper

I can't understand the meaning of the curly braces. If scraper is a function, then why use {} instead of ()?

Gary Li asked Mar 04 '11 02:03


2 Answers

scraper is a subroutine defined with the (&) prototype:

sub scraper (&) {
    my $code_ref = shift;
    ...
    $code_ref->($some_value)
    ...
}

This prototype tells perl to parse calls to the user-defined subroutine scraper in much the same way as the builtin map {...} @list and grep {...} @list constructs. The code enclosed in the {...} braces is passed into the function as a code reference, exactly as if you had written sub {...}. To write a map- or grep-like subroutine, you would use the (&@) prototype, which tells perl to expect a code block and then a list.
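As a rough illustration of the (&@) case, here is a minimal sketch of a map-like subroutine. The name apply_to_all is made up for this example and is not part of Web::Scraper or core Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: the (&@) prototype lets callers pass a bare block
# followed by a list, the same shape as map and grep.
sub apply_to_all (&@) {
    my ($code, @items) = @_;
    my @results;
    for (@items) {                 # each element is visible to the block as $_
        push @results, $code->();
    }
    return @results;
}

my @doubled = apply_to_all { $_ * 2 } 1, 2, 3;
print "@doubled\n";                # prints "2 4 6"
```

Note that the sub has to be declared (or predeclared) before the call is compiled, otherwise perl does not know about the prototype and will not treat the braces as a block.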

In this case, the (&) prototype states that the function takes exactly one argument, with high precedence, and that the argument must be a bare block (interpreted as a coderef), a literal sub {...} declaration, or an expression preceded by a reference/dereference pair, \&{some_expression}. If the expression is a simple scalar, you can write \&$code_ref.

While the syntax you found scraper {...} is the shortest, you can also call it as scraper sub {...}.

If the value you need to pass to scraper is held in a variable, you can write:

scraper \&$code_ref; # where the \& portion asserts that the value is a coderef

The same syntax is used with named subroutines:

sub some_sub {...}

scraper \&some_sub;
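To try these call forms without installing Web::Scraper, something like the following should work. It is only a sketch: run_block is a hypothetical stand-in for scraper that simply calls the code reference it receives.

```perl
use strict;
use warnings;

# Hypothetical stand-in for scraper: takes one code reference and runs it.
sub run_block (&) {
    my $code_ref = shift;
    return $code_ref->();
}

sub named_sub { return "named sub" }
my $code_ref = sub { return "coderef in a variable" };

# Four ways of passing code to a (&) prototyped sub.
my $from_block    = run_block { "bare block" };
my $from_anon     = run_block sub { "anonymous sub" };
my $from_named    = run_block \&named_sub;
my $from_variable = run_block \&$code_ref;

print "$_\n" for $from_block, $from_anon, $from_named, $from_variable;
```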

You can learn more about Perl's subroutine options at perlsub.

Lastly, the requisite warning about prototypes. Prototypes are often misused by novice Perl programmers as a form of argument validation, similar to function signatures in some other languages. This usage is error-prone because prototypes impose context (scalar vs. list) on the arguments. Argument checking, if it is needed at all, is best done inside the subroutine while unpacking @_. Prototypes should be reserved for cases such as scraper, where the intent is to create a subroutine that is parsed in a similar way to one of perl's builtin functions.
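A small sketch of the context problem, using two made-up subs (add_pair and add_pair_checked are illustrative names only): the ($$) prototype looks like "takes two arguments", but it silently puts an array argument into scalar context.

```perl
use strict;
use warnings;

# Hypothetical sub whose ($$) prototype was intended as argument validation.
sub add_pair ($$) {
    my ($x, $y) = @_;
    return $x + $y;
}

# The alternative: no prototype, check the arguments while unpacking @_.
sub add_pair_checked {
    die "add_pair_checked expects exactly two arguments\n" unless @_ == 2;
    my ($x, $y) = @_;
    return $x + $y;
}

my @nums = (3, 4, 5);

# The first $ forces @nums into scalar context, so the sub receives the
# element count (3) rather than the first element.
my $sum = add_pair(@nums, 1);
print "$sum\n";                                  # prints "4"

print add_pair_checked(@nums[0, 1]), "\n";       # prints "7"
```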

Eric Strom answered Nov 03 '22 01:11


Web::Scraper::scraper is a function that takes another function (or function reference) as an argument. In this context, the { ... } block declares an anonymous subroutine that is passed as an argument to the function (in other contexts, { ... } can construct a hash reference). Presumably, the function will cause the code you supply to be executed at some point (the documentation suggests this happens when the scrape function is called).
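The idea of storing a block now and running it later can be sketched like this. The names deferred and run_deferred are invented for the example; this is not how Web::Scraper is actually implemented, just the general pattern.

```perl
use strict;
use warnings;

# Hypothetical stand-in for scraper: stores the block without running it.
sub deferred (&) {
    my $code_ref = shift;
    my $task = { code => $code_ref };   # here { ... } builds a hash reference
    return $task;
}

# Hypothetical stand-in for scrape: runs the stored block on demand.
sub run_deferred {
    my ($task, @args) = @_;
    return $task->{code}->(@args);
}

my @log;
my $task = deferred { push @log, "ran with @_"; scalar @log };

print scalar(@log), "\n";               # prints "0": nothing has run yet
run_deferred($task, "page.html");
print "$log[0]\n";                      # prints "ran with page.html"
```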

There are alternative ways of calling a function like this that you may have seen before:

# reference to a named function
sub my_scraper_function { process "li", "list[]" => "TEXT" };
scraper \&my_scraper_function;

# reference to an anonymous function held in a variable
# (the prototype rejects a plain scalar, so write \&$var)
my $scraper_function = sub { process "li", "list[]" => "TEXT" };
scraper \&$scraper_function;

# using a function name given as a string
# (the \&{...} form is allowed even under 'use strict "refs"')
sub my_scraper_function { process "li", "list[]" => "TEXT" };
scraper \&{'ThisPackage::my_scraper_function'};

You will also see this syntax used with some of Perl's built-in and other common functions, like map and grep:

@square_roots = map { sqrt($_) } 1 .. 100;
mob answered Nov 03 '22 01:11