
How does this Perl grep work to determine the intersection of several hashes?

Tags: grep, perl

I don't understand the last line of this function from Programming Perl 3e.

Here's how you might write a function that does a kind of set intersection by returning a list of keys occurring in all the hashes passed to it:

@common = inter( \%foo, \%bar, \%joe );
sub inter {
    my %seen;
    for my $href (@_) {
        while (my $k = each %$href) {
            $seen{$k}++;
        }
    }
    return grep { $seen{$_} == @_ } keys %seen;
}

I understand that %seen is a hash which maps each key to the number of times it was encountered in any of the hashes provided to the function.

titaniumdecoy asked Apr 15 '10 05:04


3 Answers

grep takes the list passed to it (in this case, every key seen in any of the hashrefs) and returns a list of only those elements for which the expression in the block is true. While the block runs, $_ is aliased to the current element of the list.

Let's look at how that expression is evaluated:

  • @_ is an array of all the parameters passed to the subroutine - in our case a list of hash references passed in.

  • In the expression $seen{$_} == @_, that array is evaluated in scalar context, because == imposes scalar context on both of its operands.

  • In scalar context, an array evaluates to the number of elements it holds - in the example call above, 3, since three hashrefs were passed in. (This applies to arrays such as @_, not to a literal comma-separated list.)
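The scalar-context behaviour is easy to check in isolation. Below is a minimal sketch; the sub name count_args and the stand-in arguments are made up for illustration and are not from the book.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments it received.
sub count_args {
    my $n = @_;        # assigning an array to a scalar uses scalar context
    return $n;
}

my @refs = ( {}, {}, {} );          # three empty hashrefs as stand-ins

print count_args(@refs), "\n";      # prints 3
print "equal\n" if 3 == @refs;      # == also imposes scalar context
print scalar(@refs), "\n";          # scalar() makes the context explicit; prints 3
```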

So, for each key in %seen (i.e. each key seen in any of the N hashrefs), the expression $seen{$_} == @_ numerically compares the number of times the key was seen with the total number of hashes. Because a key can occur at most once per hash, the two are equal only if the key is in ALL the hashes that were passed in, and thus a member of the intersection we want.

To sum up, the grep returns a list of all keys that occur in EVERY hash (that is, keys that occur N times, where N is the number of hashes), i.e. the intersection.
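To make the counting concrete, here is the function from the question run against three small hashes. The contents of %foo, %bar and %joe are invented for this sketch; only the inter sub itself comes from the question.

```perl
use strict;
use warnings;

# Sample data (made up for illustration).
my %foo = ( a => 1, b => 1, c => 1 );
my %bar = ( b => 1, c => 1, d => 1 );
my %joe = ( c => 1, b => 1, e => 1 );

sub inter {
    my %seen;
    for my $href (@_) {
        while (my $k = each %$href) {
            $seen{$k}++;    # each hash contributes at most 1 per key
        }
    }
    # With the sample data, %seen is now: a => 1, b => 3, c => 3, d => 1, e => 1
    return grep { $seen{$_} == @_ } keys %seen;
}

# keys returns its list in no particular order, so sort for stable output.
my @common = sort { $a cmp $b } inter( \%foo, \%bar, \%joe );
print "@common\n";    # prints "b c"
```

Only b and c reach a count of 3, which matches the number of hashrefs passed in, so only they survive the grep.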

DVK answered Oct 06 '22 07:10


grep BLOCK LIST

This applies the block to each element of the list in turn, with the element aliased to $_. If the block returns a true value, the element is included in the returned list.
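A small standalone illustration of that behaviour, using made-up numbers rather than the hashes from the question:

```perl
use strict;
use warnings;

my @numbers = ( 1, 2, 3, 4, 5, 6 );

# Keep only the elements for which the block returns a true value.
my @even = grep { $_ % 2 == 0 } @numbers;
print "@even\n";     # prints "2 4 6"

# In scalar context grep returns the number of matches instead.
my $count = grep { $_ > 4 } @numbers;
print "$count\n";    # prints 2
```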

In this case:

grep { $seen{$_} == @_ } keys %seen

The block is $seen{$_} == @_, which compares the value stored in the %seen hash for the current key against @_. @_ holds the arguments to the current function, in this case ( \%foo, \%bar, \%joe ). Because == evaluates @_ in scalar context, it yields the number of elements in the array, which is 3 here. Our list is keys %seen, which returns a list of all the keys present in %seen.

Equivalent English statements:

  • "give me a list of all the keys from %seen where the value associated with that key is equal to the number of elements passed to this function"
  • "give me a list of all the keys from %seen where the value associated with that key is 3"
  • "give me a list of all the keys from %seen that have value 3, i.e. all the keys from %seen that are present in each of the 3 hashrefs passed to this function"

spazm answered Oct 06 '22 07:10


The purpose of the function is to find the elements that appear in all the hashes passed to it.

The last line greps the list returned from keys %seen. To determine if a given key appears in all the hashes that were passed to the function, we can compare that key's value in %seen to the number of arguments to inter.

In the grep block, $_ is set to each element of the keys list in turn, and the block's condition is tested against it.

An array in scalar context evaluates to its length. @_ is the array of arguments passed into the subroutine, and the == operator puts its operands in scalar context, so we can compare the value of $seen{$_} directly to the length of @_. If they're the same, then that key appeared in all the hashes.
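If the one-line grep still feels dense, it may help to see the same logic written out with an explicit count and loop. This is only a sketch of an equivalent form: inter_verbose is a hypothetical name, and it iterates with keys instead of the book's each loop.

```perl
use strict;
use warnings;

# Same logic as inter, spelled out step by step.
# inter_verbose is a hypothetical name, not from the book.
sub inter_verbose {
    my @hrefs = @_;
    my $n     = scalar @hrefs;    # number of hashes passed in

    my %seen;
    for my $href (@hrefs) {
        $seen{$_}++ for keys %$href;
    }

    my @common;
    for my $key ( keys %seen ) {
        # a count of $n means the key occurred in every hash
        push @common, $key if $seen{$key} == $n;
    }
    return @common;
}

my @common = inter_verbose( { x => 1, y => 1 }, { y => 1, z => 1 } );
print "@common\n";    # prints "y"
```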

friedo answered Oct 06 '22 07:10