At first, I simply wondered why there was a colon before the empty braces in line 2 of this code (from the Perl 6 Advent Calendar, December 25, 2018)?
sub is-happy( $n is copy ) {
my $seen-numbers = :{};
while $n > 1 {
return False if $n ∈ $seen-numbers;
$seen-numbers{$n} = True;
$n = $n.comb.map(*²).sum
}
return True;
}
say is-happy(7); # True
say is-happy(2018); # False
This code runs practically instantly. I tried say $seen-numbers.^name;
and found that it was Hash[Mu,Any]
.
However, when I remove the colon, is-happy(7)
returns True
, but then is-happy(2018)
tied up the CPU for several minutes (until I killed the process).
Also, in this case, say $seen-numbers.^name;
prints Hash
.
So, obviously, :{}
results in creating a Hash[Mu,Any]
. How does that work? Is this a work-around or is idiomatic? And what is a Hash[Mu,Any]
and how does it compare to a normal Hash
?
A Hash
has Str
keys. Any key that is not a Str
will be coerced into one before storing the value under that string key. The {}
constructs an empty Hash
.
This behavior can be changed by giving Hash
type arguments. A Hash[Int]
, for example, still has the auto-coercing string key, but can only store Int
values (much like an Array[Int]
can only store Int
values). It is also possible to pass a second type argument to pick the type of the key. This creates what is often referred to as an "object hash", and it stores the exact keys that are provided in the indexer. So a Hash[Mu,Any]
is a hash with unconstrained values and Any
kind of key (nearly all types fall under Any
, with the exception of Junction
). The .WHICH
of the key is used for hashing.
Most of the time, one doesn't create object hashes using Hash[TValue,TKey]
, but instead using a declaration syntax; for example, I might represent edges in a DAG using something like has Array %!edges-from{Mu}
. There is also a way to create an anonymous object hash, :{}
, which is what you observe in the example.
So, what difference does it make? Here:
$seen-numbers{$n} = True;
The $n
will be stored as an Int
, its .WHICH
used for hashing to get the O(1) lookup. If we'd just used {}
instead, we'd have had the Int
being coerced into a Str
for the key. That difference becomes important here:
return False if $n ∈ $seen-numbers;
Because the set membership operator looks for values to be identical. So when we have an object hash, :{}
, storing the Int
s, then $n
might be identical to one of the keys in the hash; if it is, the algorithm terminates. However, with a normal hash, created with {}
, the keys are Str
, so never identical to the Int
, and so the set membership check will always fail, causing non-termination of the algorithm.
The program is consistent with its own decisions. One could write:
return False if $seen-numbers{$n}:exists;
And then it will work with the {}
(Hash
) also. However, it will be coercing the numbers into Str
for storage and lookup every single time, a cost avoided with :{}
. So, we can say that the program has made a more efficient data structure choice by using :{}
, and that in turn enabled the use of ∈
, which could be considered more readable (especially if the target audience is folks used to reading such mathematical operators).
Another way to write the program that may have been clearer is:
my %seen-numbers is SetHash;
while $n > 1 {
return False if $n ∈ %seen-numbers;
%seen-numbers{$n} = True;
$n = $n.comb.map(*²).sum
}
return True;
Which makes clearer that we're thinking of %seen-numbers
as a set, although in this case the name leaves relatively little question about its purpose.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With