
How should I process HTML META tags with Mojo::UserAgent?

I have to work with some misconfigured web servers, so I started processing the HTML meta tags to feed information back into the web user-agent object. I tried a variety of ways of doing this in Mojolicious and settled on listening for a "finish" event on the response. My goal was to make this mostly invisible to the rest of the code, so that the rest of the program isn't even aware it is happening.

Still, this just doesn't sit right with me, for a reason I can't quite put my finger on. Aside from the particular code in process_meta_options, is there a more Mojolicious way to do this? For example, the question "Mojo::UserAgent get() with userdefined callback" uses the read event, but I suspect that might interfere with normal response handling. Or I could just be overthinking it.

use v5.20;

use feature qw(signatures);
no warnings qw(experimental::signatures);

use Data::Dumper;
use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new;

my $tx = $ua->build_tx( GET => 'http://blogs.perl.org' ); 

$tx->res->on(
    finish => \&process_meta_options
    );

$tx = $ua->start( $tx );
say "At end, charset is ", $tx->res->content->charset;

sub process_meta_options ( $res ) {
    $res
        ->dom
        ->find( 'head meta[charset]' )  # HTML 5
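        ->each( sub {  # each rather than map: only the side effect matters
            return unless my $meta_charset = $_->{charset};
            # Assumes text/html if the server sent no Content-type at all
            my $content_type = $res->headers->header( 'Content-type' ) // 'text/html';
            $content_type =~ s/;.*//;  # drop any existing parameters
            $res->headers->header( 'Content-type', "$content_type; charset=$meta_charset" );
            } );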
    }
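
The selector above only covers the HTML 5 form of the tag. As a rough sketch of how the older http-equiv form might be handled too, something like the following could work. The helper name meta_charset and the sample documents are made up for illustration, and the regex is only an approximation of how a charset parameter can be written:

```perl
use v5.20;
use warnings;
use feature qw(signatures);
no warnings qw(experimental::signatures);

use Mojo::DOM;

# Hypothetical helper (not part of the original code): return the charset
# declared in a meta tag, or undef if there isn't one.
sub meta_charset ( $html ) {
    my $dom = Mojo::DOM->new( $html );

    # HTML 5: <meta charset="...">
    if ( my $meta = $dom->at( 'head meta[charset]' ) ) {
        return $meta->{charset} if length $meta->{charset};
    }

    # HTML 4: <meta http-equiv="Content-Type" content="text/html; charset=...">
    for my $meta ( $dom->find( 'head meta[http-equiv][content]' )->each ) {
        next unless lc( $meta->{'http-equiv'} // '' ) eq 'content-type';
        return $1 if $meta->{content} =~ /charset\s*=\s*"?([^"\s;]+)/i;
    }

    return;
}

# Made-up sample documents, one per form
my $html5 = '<html><head><meta charset="utf-8"></head><body></body></html>';
my $html4 = '<html><head><meta http-equiv="Content-Type" '
          . 'content="text/html; charset=iso-8859-1"></head><body></body></html>';

say meta_charset( $html5 ) // '(none)';
say meta_charset( $html4 ) // '(none)';
```

Keeping the lookup in a function that takes a string means it can be checked without touching the network, and the finish handler only has to pass it the response body.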
brian d foy asked Nov 10 '22 05:11



1 Answer

I think the answer is just what I came up with. I haven't found anything that I liked better.

use v5.20;

use feature qw(signatures);
no warnings qw(experimental::signatures);

use Data::Dumper;
use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new;

my $tx = $ua->build_tx( GET => 'http://blogs.perl.org' ); 

$tx->res->on(
    finish => \&process_meta_options
    );

$tx = $ua->start( $tx );
say "At end, charset is ", $tx->res->content->charset;

sub process_meta_options ( $res ) {
    $res
        ->dom
        ->find( 'head meta[charset]' )  # HTML 5
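        ->each( sub {  # each rather than map: only the side effect matters
            return unless my $meta_charset = $_->{charset};
            # Assumes text/html if the server sent no Content-type at all
            my $content_type = $res->headers->header( 'Content-type' ) // 'text/html';
            $content_type =~ s/;.*//;  # drop any existing parameters
            $res->headers->header( 'Content-type', "$content_type; charset=$meta_charset" );
            } );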
    }
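
One variation that might get closer to the "invisible to the rest of the code" goal: Mojo::UserAgent emits a start event for every transaction, so the finish handler can be attached there once instead of building each transaction by hand. This is only a sketch, not a tested recommendation. The embedded Mojolicious::Lite app is a made-up stand-in for a misconfigured server so the example runs without the network, and it assumes the renderer leaves an already-set Content-Type header alone:

```perl
use v5.20;
use Mojolicious::Lite;  # also turns on strict and warnings
use Mojo::UserAgent;
use feature qw(signatures);
no warnings qw(experimental::signatures);

# Hypothetical stand-in for a misconfigured server: the charset appears
# only in the meta tag, not in the Content-Type header.
get '/' => sub ( $c ) {
    $c->res->headers->content_type( 'text/html' );
    $c->render( data => '<html><head><meta charset="utf-8"></head><body>hi</body></html>' );
    };

my $ua = Mojo::UserAgent->new;
$ua->server->app( app );  # relative URLs now go to the test app above

# Hook every transaction this user agent starts, so callers don't have
# to build the transaction and attach the handler themselves.
$ua->on( start => sub ( $ua, $tx ) {
    $tx->res->on( finish => \&process_meta_options );
    } );

my $tx      = $ua->get( '/' );
my $charset = $tx->res->content->charset;
say "At end, charset is ", $charset // '(none)';

sub process_meta_options ( $res ) {
    $res
        ->dom
        ->find( 'head meta[charset]' )  # HTML 5
        ->each( sub {
            return unless my $meta_charset = $_->{charset};
            # Assumes text/html if there is no Content-type header at all
            my $content_type = $res->headers->content_type // 'text/html';
            $content_type =~ s/;.*//;  # drop any existing parameters
            $res->headers->content_type( "$content_type; charset=$meta_charset" );
            } );
    }
```

With the hook on the user agent, plain calls such as $ua->get( ... ) pick up the meta handling without any change at the call site, at the cost of running the handler for every response that user agent receives.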
brian d foy answered Nov 15 '22 06:11
