
Specify multiple classes in HTML::Element's look_down routine in Perl?

I am using HTML::TreeBuilder to parse some HTML.

Can you specify multiple classes in the 'look_down' routine?

For instance, when searching through HTML using:

for ( $tree->look_down( 'class' => 'postbody'))

I also want to search for an additional class, 'postprofile', in the same loop.

Is there a way of doing this without having to write a second loop, for ( $tree->look_down( 'class' => 'postprofile' )) ?

A second loop brings back 2 sets of results, whereas I only want one merged set.

I tried using for ( $tree->look_down( 'class' => 'postbody||postprofile')), but this did not work.

Thank you in advance.

Ebikeneser asked Jul 13 '11 10:07


2 Answers

Try using a pattern instead of a string, i.e.,

$tree->look_down( 'class' => qr/^(?:postbody|postprofile)$/)
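
A minimal sketch of how this might look in a complete script. The sample HTML and the variable names are made up for illustration. The regex form matches elements whose class attribute is exactly one of the two names. The coderef form is an alternative, assuming your elements may carry several space-separated classes.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup, standing in for the real page.
my $html = <<'HTML';
<div class="postbody">Body text</div>
<div class="postprofile">Profile text</div>
<div class="other">Ignored</div>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# One merged result set: class is exactly 'postbody' or 'postprofile'.
my @posts = $tree->look_down( class => qr/^(?:postbody|postprofile)$/ );
my @classes = map { $_->attr('class') } @posts;

for my $element (@posts) {
    printf "%s: %s\n", $element->attr('class'), $element->as_trimmed_text;
}

# Alternative: a coderef criterion that also copes with multi-class
# attributes such as class="postbody first".
my @loose = $tree->look_down(
    sub {
        my $class = $_[0]->attr('class') // '';
        return scalar grep { $_ eq 'postbody' || $_ eq 'postprofile' }
            split ' ', $class;
    }
);
my $loose_count = scalar @loose;

# Free the tree explicitly once the elements are no longer needed.
$tree->delete;
```

Both calls return the matching elements in document order, so a single for loop can walk the merged list.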
Stuart Watt answered Sep 20 '22 19:09


Jambo, I am not trying to be rude, but please read the manual. I added links to your question.

I am going to assume that you did not read the docs because you were unable to find them. Let's address that issue:

How to Find the Docs You Need

Online:

  • search.cpan.org (which now redirects to metacpan.org) is the main website for searching CPAN modules and their documentation. Many things can be found there.

  • perldoc.perl.org hosts the complete documentation that ships with Perl, for several recent versions.

Command Line:

  • perldoc perl shows a table of contents listing the different sections of documentation you can peruse.

  • perldoc -f function is a quick way to search perlfunc and see the information on only one function. This is a super handy quick reference.

  • perldoc Module::Name::Here will show you a module's documentation.

  • perldoc perlpod is an example of reading one section of the docs, in this case the article on POD formatting.

Which thing do I read?

All this is great, but how do you know where to look? I mean, I've got this thing called "look_down" that I am using. Where are the docs?

In this case, you can see that "look_down" is always called like this: $somevar->look_down(blarg). Find where $somevar comes from. What kind of object is it? In the worst case, you find that it is the result of some other call, and now you have to find the docs for THAT call and see what it returns. But the steps are the same: recursively push on through. Eventually you'll get to my $tree = HTML::TreeBuilder->new_from_content() or something like that. Now you can read the new_from_content docs in HTML::TreeBuilder. Hey, we get an HTML::TreeBuilder object, and HTML::TreeBuilder is a subclass of HTML::Element! So we check both classes. Whoa, look_down is in HTML::Element.
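
If you would rather let Perl do the digging, here is a rough sketch of the same search in code. It assumes HTML::TreeBuilder is installed; the variable names are just for illustration. It asks the object for its class, then walks the inheritance chain until it finds the package that actually defines look_down.

```perl
use strict;
use warnings;
use mro;
use Scalar::Util qw( blessed );
use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_content('<p class="postbody">hi</p>');

# Step 1: what kind of object is $tree?
my $class = blessed $tree;
print "Object class: $class\n";

# Step 2: walk the class and its parents, in method resolution order,
# and stop at the first package that defines look_down itself.
my $defined_in;
for my $candidate ( @{ mro::get_linear_isa($class) } ) {
    no strict 'refs';
    if ( defined &{"${candidate}::look_down"} ) {
        $defined_in = $candidate;
        last;
    }
}

print "look_down is defined in: ", $defined_in // 'unknown', "\n";
print "Read it with: perldoc $defined_in\n" if defined $defined_in;
```

The package it reports is the one to hand to perldoc.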

This is a little trickier if you have routines that are imported from other modules. Hopefully the author of your code was considerate enough to explicitly list where his routines come from:

use Some::Module qw( useful_sub  confusing_sub );

This means that useful_sub and confusing_sub come from Some::Module.

If you are unlucky, your author wrote only use Some::Module; which means you get all the default exports, and you need to read the module's docs to find out what was imported.

For maintainability's sake, you can reduce this nightmare by always specifying exactly which routines you import from a module. If you want to import NOTHING, you can specify that as: use Some::Module ();
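
As a small illustration, here is a sketch using two core modules, List::Util and Scalar::Util, chosen only because they ship with Perl. The first import names exactly one routine; the second imports nothing, so its routine has to be called by its full name.

```perl
use strict;
use warnings;

# Explicit import list: it is obvious where first() comes from.
use List::Util qw( first );

# Empty import list: nothing lands in our namespace.
use Scalar::Util ();

my $first_big = first { $_ > 2 } 1 .. 5;
print "first value above 2: $first_big\n";

# With no imports, the fully qualified name documents the origin itself.
my $is_number = Scalar::Util::looks_like_number('42') ? 'yes' : 'no';
print "42 looks like a number: $is_number\n";
```

Either style tells the next reader which module's docs to open.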

When looking for plain sub names, it helps to remember that they may be Perl's built-in functions. So don't forget to search perldoc, for example with perldoc -f function.

In closing, I hope you find this useful. R-ing TFM is an amazingly powerful technique, and learning how to find relevant docs is the hidden skill that unlocks the power. Perl has a ton of docs to wade through, and it can be intimidating when you don't know where to look.

daotoad answered Sep 18 '22 19:09