Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Perl 5 - Iterator

I have implemented a simple iterator in perl. I normally work with C#, and use iterators and functional programming quite frequently. So I thought it would be simple to get some basics working in perl.

Problem is, I'm getting some poor performance, I don't expect be any faster than for or foreach, but I thought someone could give me some insight in how to speed it up.

Here is the guts of my package:

package Iterator;
use strict;

#Constructor for Iterator type
sub new {
  my $type = $_[0];
  my $this = {};

  #set default values
  $this->{Array} = @_[1];
  $this->{Index} = 0;
 $this->{Sub} = sub { 
   my @array = @{$this->{Array}};
   return $#array >= $this->{Index} ? $array[$this->{Index}++] : undef;
  };

  #return self
  bless($this, $type);
  return $this;
}

#Iterates next
sub Next {
 return $_[0]->{Sub}->();
}

Allows you to do this:

my $iterator = Iterator->new(\@array);
while (defined(my $current = $iterator->Next())) {
  print $current, "\n";
}

Not flashy... yet.

Also enables some functional code like this:

my $sum = 0;
Iterator
  ->new(\@array)
  ->Where(sub { $_[0] % 2 == 0 })
  ->ForEach(sub { $sum += $_[0] });

Which would sum up all the even values of an array.

My bottleneck is the iteration code:

$this->{Sub} = sub { 
   my @array = @{$this->{Array}};
   return $#array >= $this->{Index} ? $array[$this->{Index}++] : undef;
  };

Any pointers to speed this up?

like image 376
jonathanpeppers Avatar asked Jan 28 '11 14:01

jonathanpeppers


2 Answers

A bit late to the game here, but since you are concerned about performance, one of the largest bottlenecks in iterator type code is that the fields of your hash based object need to be dereferenced on each access. One way to combat this is to use closures in which the fields are closed over variables, avoiding unneeded dereferencing.

In my module List::Gen which contains a fairly performant implementation of lazy lists, I wrote the utility function curse which makes closure based objects behave like normal Perl objects.

Here is a short example of your iterator class written with curse. In a simple benchmark summing 1000 numbers, this method is twice as fast as yours, even after fixing all of the inefficiencies noted in the other answers.

{package Iterator;
    use List::Gen 'curse';
    sub new {
        my ($class, $array) = @_;
        my $index = 0;
        curse {
            next  => sub {$$array[$index++]},
            index => sub :lvalue {$index},
        } => $class
    }
    sub reset {shift->index = 0}
} 

If you are really pushing for more speed, since the next method does not need to be passed anything, you could even write:

my $next = $iterator->can('next');

while (defined (my $x = $next->()) {...}

Which will give you a 30% to 40% speed boost over a normal method call.

You can read the source of List::Gen for more advanced usage of curse

like image 110
Eric Strom Avatar answered Oct 03 '22 04:10

Eric Strom


You might find it useful to read a bit of Higher Order Perl.

like image 31
Dave Cross Avatar answered Oct 03 '22 05:10

Dave Cross