Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In Perl, why are tied arrays so slow?

In my testing, I have noticed that iterating over tied arrays are at best half as fast as using the internal accessor methods (FETCH and FETCHSIZE). The following benchmark shows the issue:

{package Array;
    sub new {
        my $class = shift;
        tie my @array, $class, [@_];
        \@array
    }
    sub TIEARRAY {
        my ($class, $self) = @_;
        bless $self => $class;
    }
    sub FETCH     {$_[0][$_[1]]}
    sub FETCHSIZE {scalar @{$_[0]}}
}

use List::Util 'sum';
use Benchmark 'cmpthese';

for my $mag (map 10**$_ => 1 .. 5) {

    my $array = Array->new(1 .. $mag);
    my $tied  = tied(@$array);
    my $sum   = sum @$array;

    print "$mag: \n";
    cmpthese -2 => {
        tied => sub {
            my $x = 0;
            $x += $_ for @$array;
            $x == $sum or die $x
        },
        method => sub {
            my $x = 0;
            $x += $tied->FETCH($_) for 0 .. $tied->FETCHSIZE - 1;
            $x == $sum or die $x
        },
        method_while => sub {
            my $x = 0;
            my $i = 0; $x += $tied->FETCH($i++) while $i < $tied->FETCHSIZE;
            $x == $sum or die $x
        },
        method_while2 => sub {
            my $x = 0;
            my $i = 0;
            $x += tied(@$array)->FETCH($i++) 
                while $i < tied(@$array)->FETCHSIZE;
            $x == $sum or die $x
        },
        method_while3 => sub {
            my $x = 0;
            my $i = 0;
            while ($i < tied(@$array)->FETCHSIZE) {
                local *_ = \(tied(@$array)->FETCH($i++));
                $x += $_
            }
            $x == $sum or die $x
        },
    };
    print "\n";
}

With an array size of 1000, the benchmark returns:

1000: 
                Rate   tied method_while3 method_while2 method_while   method
tied           439/s     --          -40%          -51%         -61%     -79%
method_while3  728/s    66%            --          -19%         -35%     -65%
method_while2  900/s   105%           24%            --         -19%     -57%
method_while  1114/s   154%           53%           24%           --     -47%
method        2088/s   375%          187%          132%          87%       --

I have omitted the other runs because the size of the array does not produce a meaningful change in the relative speeds.

method is of course the fastest, because it does not check the size of the array on each iteration, however method_while and method_while2 seem to be operating on the tied array in the same manner as the for loop, yet even the slower method_while2 is twice as fast as the tied array.

Even adding localization of $_ and aliased assignment to method_while2 in method_while3 results in 66% faster execution than the tied array.

What extra work is happening in the for loop that is not happening in method_while3? Is there any way to improve the speed of the tied array?

like image 967
Eric Strom Avatar asked Apr 01 '11 19:04

Eric Strom


1 Answers

Every time you use the tied array, it has to look up the tied object, then look up the methods, then invoke them. With your other versions, you are doing some or all of that lookup either at compile time or once before the loop instead of on every access.

(The comparison of speeds between method and the other method_* versions is a good example of this, by the way: you're seeing the expense of doing that FETCHSIZE, even having the tied object already looked up. Now apply that cost to every single operation that touches the array.)

like image 105
geekosaur Avatar answered Sep 28 '22 16:09

geekosaur