Why does Perl's Archive::Tar run out of memory?

I am using the Perl code below to list the files in a tar archive. The archive is always about 15 MB.

use Archive::Tar;

my $file  = shift;
my $tar   = Archive::Tar->new($file);
my @lists = $tar->list_files;
die $tar->error unless @lists;

Executing this code gives me an "Out of memory" error. My Linux system has about 512 MB of RAM and I don't want to add more. Can anyone suggest how this code could be modified to use less memory, or another way to list the files in a tar archive?

asked Nov 10 '09 by Space

2 Answers

From the Archive::Tar FAQ:

Isn't Archive::Tar slow? Yes it is. It's pure Perl, so it's a lot slower than your /bin/tar. However, it's very portable. If speed is an issue, consider using /bin/tar instead.

Isn't Archive::Tar heavier on memory than /bin/tar? Yes it is, see the previous answer. Since Compress::Zlib and therefore IO::Zlib don't support seek on their filehandles, there is little choice but to read the archive into memory. This is OK if you want to do in-memory manipulation of the archive.

If you just want to extract, use the extract_archive class method instead. It will optimize and write to disk immediately.

Another option is to use the iter class method to iterate over the files in the tarball without reading them all in memory at once.
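
If extraction rather than listing is the goal, the extract_archive route from the FAQ might look like the sketch below (untested; the archive path is only a placeholder):

use strict;
use warnings;
use Archive::Tar;

my $file = 'archive.tar.gz';   # placeholder path for illustration

# extract_archive writes members to disk as it reads them (per the FAQ),
# so it avoids building the full in-memory structure
Archive::Tar->extract_archive( $file )
    or die 'Extraction failed: ' . Archive::Tar->error;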

 

So based on the above, this should be the solution (untested):

use Archive::Tar;
use feature 'say';

my $file = shift;                        # archive path, as in the question
my $next = Archive::Tar->iter( $file );  # returns an iterator over the entries

while ( my $f = $next->() ) {
    say $f->name;
}
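
Each call to $next->() reads only the next entry from the archive, which is why this avoids loading the whole tarball into memory. Note that iter is not available in very old versions of Archive::Tar, so it is worth checking the installed version if the method is missing.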

/I3az/

answered Oct 20 '22 by draegtun


I tried that on a large tar and got an error too; probably a bug in the libraries. The following worked for me:

my @files = split /\n/, `tar tf $file`;
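
A slightly safer variation on the same shelling-out idea (a sketch, not part of the original answer) is to open a pipe to tar with the list form of open, which avoids passing $file through the shell and reads one filename per line instead of slurping the whole listing:

use strict;
use warnings;

my $file = shift;

# list-form pipe open: no shell involved, names arrive line by line
open my $fh, '-|', 'tar', 'tf', $file
    or die "Cannot run tar: $!";

while ( my $name = <$fh> ) {
    chomp $name;
    print $name, "\n";
}

close $fh;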
answered Oct 20 '22 by catwalk