
How can I accept gzip-compressed content using LWP::UserAgent?

I am fetching some pages over the Web using Perl's LWP::UserAgent and would like to be as polite as possible. By default, LWP::UserAgent does not seamlessly handle compressed content via gzip. Is there an easy way to make it do so, to save everyone some bandwidth?

Ryan Tate asked Aug 16 '09 20:08


1 Answer

LWP has this capability built in, thanks to HTTP::Message. But it's a bit hidden.

First, make sure you have Compress::Zlib installed so you can handle gzip. HTTP::Message::decodable() returns the list of encodings it can decode, based on the modules you have installed; in scalar context, it returns a comma-delimited string that you can use as the value of the 'Accept-Encoding' HTTP header, which LWP requires you to add to your HTTP::Request objects yourself. (On my system, with Compress::Zlib installed, the list is "gzip, x-gzip, deflate".)
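As a quick check of what your own installation can decode, here is a minimal sketch that calls HTTP::Message::decodable() in both contexts. The variable names are only illustrative, and the output depends on which compression modules are installed, so none is shown.

```perl
use strict;
use warnings;
use HTTP::Message;

# List context: one element per encoding this installation can decode.
my @encodings = HTTP::Message::decodable();

# Scalar context: the same encodings joined with ", ",
# ready to be used as an Accept-Encoding header value.
my $header_value = HTTP::Message::decodable();

print "Decodable encodings: @encodings\n";
print "Accept-Encoding: $header_value\n";
```

If the list comes back empty, the compression modules are probably missing.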

When your HTTP::Response comes back, be sure to access the content with $response->decoded_content instead of $response->content.
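To see the difference between the two accessors without touching the network, the following sketch builds a gzip-encoded HTTP::Response by hand. The body text and headers are made up for illustration; a real response would come from the server.

```perl
use strict;
use warnings;
use HTTP::Response;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical body standing in for a page fetched from a server.
my $plain = "Hello, compressed world!\n" x 3;

# Compress it the way a gzip-capable server would.
gzip(\$plain => \my $gzipped) or die "gzip failed: $GzipError";

# Hand-built response carrying the compressed bytes.
my $response = HTTP::Response->new(200, 'OK', [
    'Content-Type'     => 'text/plain; charset=UTF-8',
    'Content-Encoding' => 'gzip',
], $gzipped);

my $raw     = $response->content;          # still the gzip byte stream
my $decoded = $response->decoded_content;  # decompressed text

print "raw length:     ", length($raw), "\n";
print "decoded length: ", length($decoded), "\n";
```

Here content returns the compressed bytes exactly as received, while decoded_content applies the Content-Encoding header and returns the original text.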

In LWP::UserAgent, it all comes together like this:

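    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;

    # Comma-separated list of encodings this installation can decode
    my $can_accept = HTTP::Message::decodable;

    my $response = $ua->get('http://stackoverflow.com/feeds',
        'Accept-Encoding' => $can_accept,
    );

    print $response->decoded_content;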
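If you would rather not pass the header on every call, one option is to set it once as a default header on the user agent. This is a sketch of that variation, not part of the approach described above; it only configures the agent and makes no request.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Message;

my $ua = LWP::UserAgent->new;

# Force scalar context so we get the comma-separated string,
# then attach it to every request this agent sends.
$ua->default_header('Accept-Encoding' => scalar HTTP::Message::decodable());

# Inspect what the agent will send by default.
my $configured = $ua->default_headers->header('Accept-Encoding');
print "Default Accept-Encoding: $configured\n";
```

Subsequent $ua->get(...) calls then advertise the same encodings, and you still read the body with decoded_content.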

This also decodes text content into Perl's Unicode strings, using the charset declared in the response. If you only want LWP to decompress the response and leave the text bytes untouched, pass charset => 'none':

    print $response->decoded_content(charset => 'none');
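The effect of charset => 'none' is easiest to see with non-ASCII text. The sketch below uses a hand-built gzip response whose body is the UTF-8 encoding of "café"; the body and headers are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use HTTP::Response;
use IO::Compress::Gzip qw(gzip $GzipError);

# "café\n" as UTF-8 bytes: the é takes two bytes (0xC3 0xA9).
my $utf8_bytes = "caf\xC3\xA9\n";

gzip(\$utf8_bytes => \my $gzipped) or die "gzip failed: $GzipError";

# Hypothetical response, as a server sending gzip-compressed UTF-8 might return.
my $response = HTTP::Response->new(200, 'OK', [
    'Content-Type'     => 'text/plain; charset=UTF-8',
    'Content-Encoding' => 'gzip',
], $gzipped);

# Decompressed and charset-decoded: a string of characters.
my $chars = $response->decoded_content;

# Decompressed only: the original UTF-8 bytes, untouched.
my $bytes = $response->decoded_content(charset => 'none');

print "characters: ", length($chars), "\n";
print "bytes:      ", length($bytes), "\n";
```

The byte form is the one to use if you plan to write the body straight to a file or hand it to a parser that does its own charset detection.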
Ryan Tate answered Oct 31 '22 15:10