
Perl memory usage with map and file handle

In Perl, does calling map { function($_) } <FILEHANDLE>; load the entire file into memory?

Eric Pruitt asked May 22 '11 14:05


3 Answers

Yes -- or at least that's how I interpret this outcome.

$ perl -e "map {0} <>" big_data_file
Out of memory!

$ perl -e "map {0} 1 .. 1000000000"
Out of memory!

One might wonder whether we run out of memory because Perl is trying to store the output of map. However, my understanding is that Perl is optimized to avoid that work whenever map is called in a void context. For a specific example, see the discussion in this question.

Perhaps a better example:

$ perl -e "sub nothing {}  map nothing(), <>" big_data_file
Out of memory!
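
The out-of-memory results above suggest that <> is fully expanded before map starts. A minimal sketch of a way to observe that without a large file, assuming an in-memory filehandle stands in for big_data_file (the variable names are illustrative only):

```perl
use strict;
use warnings;

# Five-line in-memory "file" so the sketch needs nothing on disk.
my $data = join '', map { "line $_\n" } 1 .. 5;
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

# <$fh> is in list context here, so every line is read before the block
# runs for the first time. eof($fh) is therefore already true on each call.
my @eof_in_map = map { eof($fh) ? 1 : 0 } <$fh>;
close($fh);

print "eof seen inside map: @eof_in_map\n";
```

If map pulled lines lazily, the first few entries would be 0; seeing 1 on every call is consistent with the whole file having been read up front.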

Based on the comments, it appears that the question is motivated by a desire for a compact syntax when processing large data. A while loop reads one line per iteration and can be written just as compactly:

open(my $handle, '<', 'big_data_file') or die $!;

# The loops below are alternative spellings of the same loop,
# not steps to run in sequence on one handle.

# An ordinary while loop to process a data file.
while (my $line = <$handle>){
    foo($line);
}

# Here Perl assigns each line to $_.
while (<$handle>){
    foo($_);
}

# And here we do the same thing on one line.
foo($_) while <$handle>;
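
If the map-style call syntax itself is what is wanted, one possible approach is a small wrapper around the while loop. This is only a sketch: map_lines is a hypothetical helper, not a Perl built-in, and the in-memory filehandle is a stand-in for a real data file.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Perl built-in): applies a block to each line,
# reading one line at a time instead of building the whole input list first.
# The (&$) prototype allows the block-first call syntax that map uses.
sub map_lines(&$) {
    my ($code, $fh) = @_;
    my @results;
    while (defined(my $line = <$fh>)) {
        local $_ = $line;          # expose the current line as $_, like map does
        push @results, $code->();
    }
    return @results;
}

my $data = "alpha\nbeta\ngamma\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my @lengths = map_lines { chomp; length } $fh;
close($fh);

print "@lengths\n";
```

Only the input is streamed here; the results are still collected in a list, so this helps when the per-line result is much smaller than the line itself.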

FMc answered Nov 04 '22 03:11


Yes. The operands of map, of a foreach loop, and of a sub call are fully evaluated before the map, the loop, or the call even begins.

One exception:

for my $i (EXPR_X..EXPR_Y)

(with or without my $i) is optimised into a counting loop, something along the lines of

my $x = EXPR_X;
my $y = EXPR_Y;
for (my $i = $x; $i <= $y; ++$i)
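
One visible property of that rewrite is that each endpoint expression is evaluated exactly once, before the first iteration. A minimal sketch, assuming two hypothetical counting subs, low() and high(), in place of EXPR_X and EXPR_Y (it shows the single evaluation, not the memory behaviour):

```perl
use strict;
use warnings;

my %calls = (low => 0, high => 0);

# Hypothetical endpoint functions that count how often they are evaluated.
sub low  { $calls{low}++;  return 1 }
sub high { $calls{high}++; return 5 }

my @seen;
for my $i (low() .. high()) {
    push @seen, $i;
}

print "visited: @seen\n";
print "low() calls: $calls{low}, high() calls: $calls{high}\n";
```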

Perl 6 (since renamed Raku) has native support for lazy lists.

ikegami answered Nov 04 '22 05:11


I assume the question is this: does map slurp the whole file before it begins processing, or does it read the file line by line?

Let's do a quick comparison of how the two loop forms handle lists:

while (<FILEHANDLE>) { ... }

This case clearly reads line by line: on each iteration, a new value for $_ is fetched.

for my $line (<FILEHANDLE>) { ... }

In this case, the LIST is expanded before the loop starts. The documentation at http://perldoc.perl.org/functions/map.html describes map as similar to a foreach loop, and I believe LISTs are expanded before being passed to a function, so map should behave the same way.
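
The difference between the two loops can be observed through the input line counter $. as a rough check. This sketch assumes a four-line in-memory filehandle in place of FILEHANDLE; the array names are illustrative only.

```perl
use strict;
use warnings;

my $data = join '', map { "line $_\n" } 1 .. 4;

# foreach: <$fh> is expanded to a full list before the first iteration,
# so the input line counter $. already holds the total line count.
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
my @counter_in_for;
for my $line (<$fh>) {
    push @counter_in_for, $.;
}
close($fh);

# while: one line is read per iteration, so $. advances step by step.
open($fh, '<', \$data) or die "Cannot open in-memory file: $!";
my @counter_in_while;
while (my $line = <$fh>) {
    push @counter_in_while, $.;
}
close($fh);

print "for:   @counter_in_for\n";
print "while: @counter_in_while\n";
```

In the for loop the counter is already at the last line on the first iteration, while in the while loop it climbs from 1.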

TLP answered Nov 04 '22 05:11