
Reading a specific line from a large file in Perl

Tags: file, perl, line

Is there a fast, memory-efficient way to read specific lines of a large file without loading it into memory?

I wrote a Perl script that runs many forks, and I would like each of them to read specific lines from a file.

At the moment I'm using an external command:

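sub getFileLine {
    my ( $filePath, $lineWanted ) = @_;
    # Ignore SIGPIPE; ignored signals are inherited by the spawned commands.
    $SIG{PIPE} = 'IGNORE';
    open( my $fh, '-|:utf8', "tail -q -n +$lineWanted \"$filePath\" | head -n 1" )
        or die "Cannot run tail/head on $filePath: $!";
    my $line = <$fh>;
    close $fh;
    # <$fh> returns undef when the file has fewer lines than requested.
    chomp( $line ) if defined $line;
    return $line;
}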

It's fast and it works, but maybe there's a more "Perl-ish" way that is as fast and as memory efficient as this one?

As you know, creating a fork process in Perl duplicates the main process's memory, so if the main process is using 10MB, the fork will use at least that much.

My goal is to keep the memory use of the forked processes (and therefore of the main process while the forks are running) as low as possible. That's why I don't want to load the whole file into memory.

asked Dec 18 '11 10:12 by gib



2 Answers

Before you go further, it's important to understand how fork works. When you fork a process, the OS uses copy-on-write semantics to share the bulk of the parent's and child's memory; only the memory that differs between the two needs to be separately allocated.
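
To make the fork behaviour concrete, here is a minimal sketch, assuming a Unix-like system where fork is native. The helper name run_children is hypothetical and not part of the question's script. It forks a few workers, lets each one run a callback, and waits for all of them. Data the parent built before forking is readable in every child without an explicit copy. One caveat: Perl can update reference counts and internal flags even when data is only read, so some shared pages may still end up copied.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork $count children, run $work->($index) in each,
# and return how many children exited with status 0.
sub run_children {
    my ( $count, $work ) = @_;
    my @pids;
    for my $i ( 1 .. $count ) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {
            # Child: pages are shared with the parent until one side writes.
            my $ok = eval { $work->($i); 1 };
            # _exit skips END blocks and destructors inherited from the parent.
            POSIX::_exit( $ok ? 0 : 1 );
        }
        push @pids, $pid;
    }
    my $succeeded = 0;
    for my $pid (@pids) {
        waitpid( $pid, 0 );
        $succeeded++ if $? == 0;
    }
    return $succeeded;
}
```

A call such as run_children( 4, sub { my $i = shift; ... } ) would start four workers, each of which could read its own line from the file.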

For reading a single line of a file in Perl, here's a simple way:

open my $fh, '<', $filePath or die "$filePath: $!";
my $line;
while( <$fh> ) {
    if( $. == $lineWanted ) { 
        $line = $_;
        last;
    }
}

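This uses the special $. variable, which holds the current line number of the most recently read filehandle. Because the loop stops at the wanted line, only one line is held in memory at a time.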
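
The loop above can be wrapped into a drop-in replacement for the question's getFileLine. This is a sketch under a few assumptions: the name get_line_by_number is made up for illustration, the file is assumed to be UTF-8 (mirroring the :utf8 layer used in the question), and a missing line is reported as undef rather than as an error.

```perl
use strict;
use warnings;

# Return line number $lineWanted (1-based) from $filePath without its
# trailing newline, or undef if the file has fewer lines than that.
sub get_line_by_number {
    my ( $filePath, $lineWanted ) = @_;
    # Assumption: the file is UTF-8 encoded.
    open my $fh, '<:encoding(UTF-8)', $filePath or die "$filePath: $!";
    my $line;
    while ( my $current = <$fh> ) {
        # $. is the number of the line just read from $fh.
        next unless $. == $lineWanted;
        $line = $current;
        last;    # stop reading as soon as the wanted line is found
    }
    close $fh;
    chomp $line if defined $line;
    return $line;
}
```

Because the file is opened and closed inside the sub, each forked child can call it independently without sharing a filehandle position with the parent.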

answered Oct 05 '22 17:10 by friedo


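Take a look at the Tie::File core module. It presents the lines of a file as the elements of a Perl array without loading the whole file into memory.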
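
A minimal sketch of how that might look, assuming the file only needs to be read. The sub name get_line_tied is hypothetical. Tie::File strips the record separator by default, so no chomp is needed, and it still has to scan the file up to the requested line to learn where each record starts.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use Tie::File;

# Return line number $lineWanted (1-based) from $filePath, or undef if
# the file has fewer lines than that.
sub get_line_tied {
    my ( $filePath, $lineWanted ) = @_;
    # Open read-only so the tied array can never modify the file.
    tie my @lines, 'Tie::File', $filePath, mode => O_RDONLY
        or die "Cannot tie $filePath: $!";
    # Array indices are 0-based, line numbers are 1-based.
    my $line = $lines[ $lineWanted - 1 ];
    untie @lines;
    return $line;
}
```

For a one-off lookup the plain read loop is likely simpler; the tied array is more convenient when the same process fetches many different lines from one file.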

answered Oct 05 '22 17:10 by cirne100