
Perl: save a file downloaded by LWP

I'm using LWP to download an executable file, and with the response in memory I am able to hash the file. However, how can I save this file on my system? I think I'm on the wrong track with what I'm trying below. The download is successful, as I am able to generate the hash correctly (I've double-checked it by downloading the actual file and comparing the hashes).

use strict;
use warnings;
use LWP::UserAgent;
use Digest::MD5    qw( md5_hex );
use Digest::MD5::File qw( file_md5_hex );
use File::Fetch;

my $url = 'http://www.karenware.com/progs/pthasher-setup.exe';
my $filename = $url;
$filename =~ m/.*\/(.*)$/;
$filename = $1;
my $dir ='/download/two';
print "$filename\n";

my $ua = LWP::UserAgent->new();
my $response = $ua->get($url);
die $response->status_line if !$response->is_success;
my $file = $response->decoded_content( charset => 'none' );
my $md5_hex = md5_hex($file);
print "$md5_hex\n";
my $save = "Downloaded/$filename";
    unless(open SAVE, '>>'.$save) {
        die "\nCannot create save file '$save'\n";
    }
    print SAVE $file;
    close SAVE;

If you are wondering why I don't simply download everything and then go through the folder hashing each file: it's because I'm downloading all these files in a loop. During each iteration, I upload the relevant source URL (where the file was found), along with the file name and hash, into a database in one go.

Marcus Lim asked Dec 06 '12 09:12


2 Answers

Try getstore() from LWP::Simple

use strict;
use warnings;
use LWP::Simple qw(getstore);
use LWP::UserAgent;
use Digest::MD5    qw( md5_hex );
use Digest::MD5::File qw( file_md5_hex );
use File::Fetch;

my $url = 'http://www.karenware.com/progs/pthasher-setup.exe';
my $filename = $url;
$filename =~ m/.*\/(.*)$/;
$filename = $1;
my $dir ='/download/two';
print "$filename\n";

my $ua = LWP::UserAgent->new();
my $response = $ua->get($url);
die $response->status_line if !$response->is_success;
my $file = $response->decoded_content( charset => 'none' );
my $md5_hex = md5_hex($file);
print "$md5_hex\n";
my $save = "Downloaded/$filename";
# Note: getstore() fetches the URL a second time, using LWP::Simple's own user agent
getstore($url, $save);

Demnogonis answered Oct 18 '22 20:10


getstore is an excellent solution; however, for anyone reading this answer with a slightly different setup, it may not solve the issue.

First of all, you may simply be suffering from a binary/text issue.

I'd change

my $save = "Downloaded/$filename";
unless(open SAVE, '>>'.$save) {
    die "\nCannot create save file '$save'\n";
}
print SAVE $file;
close SAVE;

to

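my $save = "Downloaded/$filename";
# Parenthesize open's arguments so that "or die" applies to open itself,
# not to $save. '>' truncates an existing file; '>>' would append to it.
open(my $fh, '>', $save) or die "\nCannot create save file '$save' because $!\n";
# on platforms where this matters
# (like Windows) this is needed for
# 'binary' files:
binmode $fh;
print $fh $file;
close $fh;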

The reason I like this better is that if you have set or acquired some settings on your browser object ($ua), they are ignored in LWP::Simple's getstore, as it uses its own browser.

Also, it uses the three-argument form of open, which is safer.
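To see why the binmode call matters, here is a minimal sketch that writes the same bytes once through a ':crlf' layer (which imitates the text-mode default on Windows) and once through ':raw' (what a plain binmode gives you). The helper name write_with_layer, the file names and the 4-byte payload are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write $bytes to $path through the given PerlIO
# layer and return the resulting file size in bytes.
sub write_with_layer {
    my ($path, $layer, $bytes) = @_;
    open(my $fh, ">$layer", $path) or die "Could not write to '$path': $!";
    print {$fh} $bytes or die "Write to '$path' failed: $!";
    close $fh or die "Could not close '$path': $!";
    return -s $path;
}

my $dir   = tempdir(CLEANUP => 1);
my $bytes = "MZ\x0A\x00";    # made-up payload that contains a 0x0A byte

# ':crlf' imitates Windows text mode; ':raw' is equivalent to binmode($fh).
my $text_size = write_with_layer("$dir/text.bin", ':crlf', $bytes);
my $raw_size  = write_with_layer("$dir/raw.bin",  ':raw',  $bytes);

printf "payload: %d bytes, text mode: %d bytes, binary mode: %d bytes\n",
    length($bytes), $text_size, $raw_size;
```

In text mode each 0x0A byte should be expanded to 0x0D 0x0A on output, so the saved file grows and no longer matches the hash computed from the in-memory content.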

Another solution is to use a content callback and write the file to disk while it is being downloaded, which helps if, for example, you are dealing with a large file. The hashing code would have to change (the whole content is no longer available in one variable), so it is probably not relevant here, but here's a sample:

my $req = HTTP::Request->new(GET => $uri);
open(my $fh, '>', $filename) or die "Could not write to '$filename': $!";
binmode $fh;
# The callback is invoked once for each chunk of content as it arrives
my $res = $ua->request($req, sub {
    my ($data, $response, $protocol) = @_;
    print $fh $data;
});
close $fh;
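If the hash is still needed with the callback approach, one option is to feed each chunk to a Digest::MD5 object as it arrives. The following is only a sketch: make_chunk_handler and download_and_hash are hypothetical helper names, $ua is assumed to be an LWP::UserAgent object created by the caller, and it uses the ':content_cb' option of $ua->get rather than an explicit HTTP::Request.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: returns a callback that writes each chunk to $fh
# and adds the same chunk to the running $md5 digest.
sub make_chunk_handler {
    my ($fh, $md5) = @_;
    return sub {
        my ($data, $response, $protocol) = @_;
        print {$fh} $data or die "Write failed: $!";
        $md5->add($data);
    };
}

# Hypothetical wrapper: download $url with $ua into $path and return
# the MD5 hex digest of the bytes that were written.
sub download_and_hash {
    my ($ua, $url, $path) = @_;
    open(my $fh, '>', $path) or die "Could not write to '$path': $!";
    binmode $fh;
    my $md5 = Digest::MD5->new;
    my $res = $ua->get($url, ':content_cb' => make_chunk_handler($fh, $md5));
    close $fh or die "Could not close '$path': $!";
    # Note: on failure an empty file is left behind at $path.
    die $res->status_line unless $res->is_success;
    return $md5->hexdigest;
}
```

Inside the download loop this could be called as my $md5_hex = download_and_hash($ua, $url, "Downloaded/$filename"); so the file name, URL and hash are all available in the same iteration.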

And if the size is unimportant (and the hashing is done some other way) you could just ask your browser to store it directly:

my $req = HTTP::Request->new(GET => $uri);
# Passing a file name as the second argument stores the content in that file
my $res = $ua->request($req, $filename);
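When the content is stored directly like this, one way to do the hashing "some other way" is to read the saved file back from disk. A minimal sketch, where md5_of_file is a hypothetical helper name built on Digest::MD5's addfile method:

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: MD5 hex digest of a file that is already on disk.
sub md5_of_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Could not read '$path': $!";
    binmode $fh;    # read the bytes exactly as stored
    my $hex = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $hex;
}

# Possible use after the request above (assumes $res and $filename from it):
#   die $res->status_line unless $res->is_success;
#   my $md5_hex = md5_of_file($filename);
```

This reads the file a second time, which is usually cheap next to the download itself and keeps the hash tied to what actually landed on disk.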

Henning Michael Møller Just answered Oct 18 '22 20:10