Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does :{} create a "Hash[Mu,Any]" object (and what is it and how does it compare to a normal "Hash")?

Tags:

raku

At first, I simply wondered why there was a colon before the empty braces in line 2 of this code (from the Perl 6 Advent Calendar, December 25, 2018)?

sub is-happy( $n is copy ) {
    my $seen-numbers = :{};
    while $n > 1 {
        return False if $n ∈ $seen-numbers;
        $seen-numbers{$n} = True;
        $n = $n.comb.map(*²).sum
    }
    return True;
}

say is-happy(7);     # True
say is-happy(2018);  # False

This code runs practically instantly. I tried say $seen-numbers.^name; and found that it was Hash[Mu,Any].

However, when I remove the colon, is-happy(7) returns True, but then is-happy(2018) tied up the CPU for several minutes (until I killed the process). Also, in this case, say $seen-numbers.^name; prints Hash.

So, obviously, :{} results in creating a Hash[Mu,Any]. How does that work? Is this a work-around or is idiomatic? And what is a Hash[Mu,Any] and how does it compare to a normal Hash?

like image 683
Christopher Bottoms Avatar asked Dec 27 '18 15:12

Christopher Bottoms


1 Answers

A Hash has Str keys. Any key that is not a Str will be coerced into one before storing the value under that string key. The {} constructs an empty Hash.

This behavior can be changed by giving Hash type arguments. A Hash[Int], for example, still has the auto-coercing string key, but can only store Int values (much like an Array[Int] can only store Int values). It is also possible to pass a second type argument to pick the type of the key. This creates what is often referred to as an "object hash", and it stores the exact keys that are provided in the indexer. So a Hash[Mu,Any] is a hash with unconstrained values and Any kind of key (nearly all types fall under Any, with the exception of Junction). The .WHICH of the key is used for hashing.

Most of the time, one doesn't create object hashes using Hash[TValue,TKey], but instead using a declaration syntax; for example, I might represent edges in a DAG using something like has Array %!edges-from{Mu}. There is also a way to create an anonymous object hash, :{}, which is what you observe in the example.

So, what difference does it make? Here:

$seen-numbers{$n} = True;

The $n will be stored as an Int, its .WHICH used for hashing to get the O(1) lookup. If we'd just used {} instead, we'd have had the Int being coerced into a Str for the key. That difference becomes important here:

return False if $n ∈ $seen-numbers;

Because the set membership operator looks for values to be identical. So when we have an object hash, :{}, storing the Ints, then $n might be identical to one of the keys in the hash; if it is, the algorithm terminates. However, with a normal hash, created with {}, the keys are Str, so never identical to the Int, and so the set membership check will always fail, causing non-termination of the algorithm.

The program is consistent with its own decisions. One could write:

return False if $seen-numbers{$n}:exists;

And then it will work with the {} (Hash) also. However, it will be coercing the numbers into Str for storage and lookup every single time, a cost avoided with :{}. So, we can say that the program has made a more efficient data structure choice by using :{}, and that in turn enabled the use of , which could be considered more readable (especially if the target audience is folks used to reading such mathematical operators).

Another way to write the program that may have been clearer is:

my %seen-numbers is SetHash;
while $n > 1 {
    return False if $n ∈ %seen-numbers;
    %seen-numbers{$n} = True;
    $n = $n.comb.map(*²).sum
}
return True;

Which makes clearer that we're thinking of %seen-numbers as a set, although in this case the name leaves relatively little question about its purpose.

like image 83
Jonathan Worthington Avatar answered Nov 08 '22 08:11

Jonathan Worthington