
Trying to understand this perl script

Tags:

perl

analysis

It seems very simple and I figured most of it out. But seeing as Perl is loose with syntax, it's difficult for a newcomer to jump right in :)

my @unique = ();
my %seen   = ();
foreach my $elem ( @array ) {
    next if $seen{ $elem }++;
    push @unique, $elem;
}

This is right from the perldoc website. If I understand correctly, it can also be written as:

my @unique = ();
my %seen   = ();
my $elem;
foreach $elem ( @array ) {
    if ( $seen{ $elem }++ ) {
        next;
    }
    push ( @unique, $elem );
}

So my understanding at this point is:

  • Declare an array named unique
  • Declare a hash named seen
  • Declare a variable named elem
  • Iterate over @array, each iteration is stored in $elem
  • If $elem is a key in the hash %seen (I have no idea what the ++ does), skip to the next iteration
  • Append $elem to the end of @unique

I'm missing 2 things:

  • When does anything get stored in %seen?
  • What does ++ do? (In every other language it increments, but I don't see how that works here.)

I know that the issue lies with this part:

$seen{ $elem }++

which I suspect is doing a bunch of different things at once. Is there a simpler, more verbose way of writing that line?

Thanks for the help


1 Answer

The ++ operator does essentially the same thing in Perl as it does in most other languages that have it: it increments a variable.

$seen{ $elem }++;

increments a value in the %seen hash, namely the one whose key is $elem.

The "magic" is that if $seen{$elem} hasn't been defined yet, it's automatically created, as if it already existed and had the value 0; the ++ then sets it to 1. So it's equivalent to:

if (! exists $seen{$elem}) {
    $seen{$elem} = 0;
}
$seen{$elem} ++;

This is called "autovivification". (No, really, that's what it's called.) (EDIT2: No, my mistake, it's not; as @ysth points out, "autovivification" actually refers to references springing into existence. See perldoc perlref.)
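
To see when something gets stored in %seen, it may help to watch what the post-increment hands back on each call. The sketch below is only an illustration (the key 'apple' and the @returned array are made up for it): ++ after the variable returns the old value and then stores the incremented one, which is why the first visit to a key tests false and every later visit tests true.

```perl
use strict;
use warnings;

my %seen;
my @returned;

# Post-increment returns the value *before* the increment.
# A key that was never stored counts as 0 here, with no warning.
for (1 .. 3) {
    my $old = $seen{apple}++;   # 'apple' is a made-up example key
    push @returned, $old;
}

print "returned: @returned\n";     # values the `if` would have tested: 0 1 2
print "stored:   $seen{apple}\n";  # count left in the hash: 3
```

So the store happens on every pass, including the first one; only the test result changes.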

EDIT: Here's a revised version of your description:

  • Declare an array variable named @unique
  • Declare a hash variable named %seen
  • Declare a scalar variable named $elem
  • Iterate over @array, with $elem set to each element in turn
  • If $elem is already a key in the hash %seen with a nonzero count, skip to the next iteration; either way, its count is incremented
  • Append the value of $elem to the end of @unique
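
As for a more verbose way of writing that line: one possible expansion of the steps above is sketched here. It assumes a small sample @array, and $count_before is a name invented for this example; it is not how the perldoc snippet is written, just the same logic spelled out one step at a time.

```perl
use strict;
use warnings;

my @array  = qw(a b a c b a);   # sample input, assumed for illustration
my @unique = ();
my %seen   = ();

foreach my $elem (@array) {
    # Step 1: read the current count (0 if this key was never stored).
    my $count_before = exists $seen{$elem} ? $seen{$elem} : 0;

    # Step 2: store the incremented count; this is where %seen gets filled.
    $seen{$elem} = $count_before + 1;

    # Step 3: a nonzero earlier count means the element is a repeat.
    next if $count_before;

    push @unique, $elem;
}

print "@unique\n";   # a b c
```

Steps 1 to 3 together are what `next if $seen{ $elem }++;` does in a single expression.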

@unique, %seen, and $elem are all variables. The punctuation character (known as the "sigil") indicates what kind of variable each of them is, and is best thought of as part of the name.
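
One wrinkle that tends to confuse newcomers, and that is not spelled out above: a single element of a hash or array is itself a scalar, so it is written with a $ sigil. That is why the loop says $seen{ $elem } even though the hash is %seen. A small sketch, with values made up for illustration:

```perl
use strict;
use warnings;

my @unique = ('x', 'y');        # array
my %seen   = (x => 1, y => 2);  # hash
my $elem   = 'y';               # scalar

# A single element is a scalar, so it is written with `$`,
# even though it lives in the hash %seen or the array @unique.
my $count = $seen{$elem};   # element of %seen   (curly braces)
my $first = $unique[0];     # element of @unique (square brackets)

print "$count $first\n";    # 2 x
```

The brackets after the name, not the sigil, tell Perl which variable is being indexed.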

Keith Thompson answered Feb 26 '26 11:02


