Synopsis 2 says:
An identifier is composed of an alphabetic character followed by any sequence of alphanumeric characters. The definitions of alphabetic and numeric include appropriate Unicode characters. Underscore is always considered alphabetic. An identifier may also contain isolated apostrophes or hyphens provided the next character is alphabetic.
Syntax in the Perl 6 docs says:
Identifiers are a grammatical building block that occur in several places. An identifier is a primitive name, and must start with an alphabetic character (or an underscore), followed by zero or more word characters (alphabetic, underscore or number). You can also embed dashes - or single quotes ' in the middle, but not two in a row.
The term "appropriate Unicode characters" assumes that we already know what counts as appropriate.
I find that too vague if I'm going to choose characters beyond ASCII. I find this production in Perl6::Grammar, but not the definition of <.ident>:
token identifier {
<.ident> [ <.apostrophe> <.ident> ]*
}
But this is circular too: you have to know what an ident is to define an identifier. So, where is <.ident>?
raiph points out that <.ident> is the ident method in QRegex::Cursor, but that defines it in terms of nqp::const::CCLASS_WORD. Now I have to track that down.
I tried to use U+00B2 (SUPERSCRIPT TWO) (General categories No, Other_Number) because I wanted to pass around the result of an expensive square operation, and hey, Perl 6 is supposed to allow this:
my $a² = $a**2;
But it turns out that ², along with the other superscripts, is an operator. That's fine, but ² and the like aren't listed as operators, nor in Int or the behavior Int inherits:
$ perl6 -e 'my $Δ² = 6; say $*PERL; say $Δ²'
Use of uninitialized value of type Any in numeric context in block <unit> at -e line 1
Cannot modify an immutable Int
in block <unit> at -e line 1
$ perl6 -e 'my $Δ = 6; say $*PERL; say $Δ²'
Perl 6 (6.c)
36
$ perl6 -e 'my $Δ = 6; say $*PERL; say $Δ³'
Perl 6 (6.c)
216
$ perl6 -e 'my $Δ = 6; say $*PERL; say $Δ⁹'
Perl 6 (6.c)
10077696
But I can't use ½ U+00BD (VULGAR FRACTION ONE HALF) (General categories of No and Other_Number):
$ perl6 -e 'my $Δ½ = 6; say $*PERL; say $Δ½'
===SORRY!=== Error while compiling -e
Bogus postfix
at -e:1
------> my $Δ⏏½ = 6; say $*PERL; say $Δ½
expecting any of:
constraint
infix
infix stopper
postfix
statement end
statement modifier
statement modifier loop
But, what if I don't put a number in $Δ?
$ perl6 -e 'my $Δ = "foo"; say $*PERL; say $Δ²'
Cannot convert string to number: base-10 number must begin with valid digits or '.' in '⏏foo' (indicated by ⏏)
in block at -e line 1
Actually thrown at:
in block at -e line 1
I was worried that someone defining a postfix operator could break the language, but this seems to work:
$ perl6 -e 'multi sub postfix:<Δ>(Int $n) { 137 }; say 6Δ;'
137
$ perl6 -e 'multi sub postfix:<Δ>(Int $n) { 137 }; my $ΔΔ = 6; say $ΔΔ;'
6
$ perl6 -e 'multi sub postfix:<Δ>(Int $n) { 137 }; my $Δ = 6; say $ΔΔ;'
===SORRY!=== Error while compiling -e
Variable '$ΔΔ' is not declared
at -e:1
------> fix:<Δ>(Int $n) { 137 }; my $Δ = 6; say ⏏$ΔΔ;
So, what's going on there?
The grammar has an identifier defined as
token apostrophe {
<[ ' \- ]>
}
token identifier {
<.ident> [ <.apostrophe> <.ident> ]*
}
with ident a method on cursors which accepts input that starts with a CCLASS_ALPHABETIC character or an underscore _ and continues with zero or more CCLASS_WORD characters.
These classes are implemented in MoarVM and map to various Unicode categories.
Specifically, CCLASS_ALPHABETIC checks for Letter, Lowercase; Letter, Uppercase; Letter, Titlecase; Letter, Modifier and Letter, Other.
CCLASS_WORD additionally accepts characters of category Number, Decimal Digit as well as underscores.
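To make the category rule concrete, here is a rough sketch in Python (not Perl 6, and not the MoarVM implementation) that models the description above with the standard unicodedata module. The function names is_alphabetic, is_word, is_ident and is_identifier are made up for this illustration, and the category sets are taken only from the list in this answer, so treat the result as an approximation.

```python
import re
import unicodedata

# Assumption: categories as listed in the answer for CCLASS_ALPHABETIC
# (Lu, Ll, Lt, Lm, Lo); CCLASS_WORD adds Nd and the underscore.
ALPHABETIC = {"Lu", "Ll", "Lt", "Lm", "Lo"}
WORD = ALPHABETIC | {"Nd"}

def is_alphabetic(ch):
    """Model of the first-character test: a letter or an underscore."""
    return ch == "_" or unicodedata.category(ch) in ALPHABETIC

def is_word(ch):
    """Model of the following-character test: letter, decimal digit or underscore."""
    return ch == "_" or unicodedata.category(ch) in WORD

def is_ident(s):
    """Model of <.ident>: one alphabetic character, then zero or more word characters."""
    return bool(s) and is_alphabetic(s[0]) and all(is_word(c) for c in s[1:])

def is_identifier(s):
    """Model of: <.ident> [ <.apostrophe> <.ident> ]*"""
    # Splitting on ' or - leaves an empty piece wherever a separator is
    # doubled or trailing, and is_ident rejects empty pieces.
    return all(is_ident(part) for part in re.split(r"['\-]", s))

for name in ["Δ", "ΔΔ", "foo-bar", "Δ²", "Δ½", "foo--bar"]:
    print(name, is_identifier(name))
```

Under this model both ² (U+00B2) and ½ (U+00BD) fall in category No, which is neither a letter category nor Nd, so neither can appear in an identifier. The difference seen in the question comes from ² being defined as an operator while ½ is not.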
As to why postfix operators do not break identifiers, that's due to longest token matching.
If you want to call a postfix operator Δ on a variable $Δ, you have to add a backslash, i.e.
multi sub postfix:<Δ>(Int $n) { 137 };
my $Δ = 6;
say $Δ\Δ;
or an 'unspace'
say $Δ\ Δ;
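The effect of longest token matching on a variable name can be pictured with a small Python sketch. It is a simplification, not the Perl 6 parser: scan_variable is a hypothetical helper that greedily consumes word characters after the sigil, and it ignores apostrophes, hyphens and the stricter rule for the first character.

```python
import unicodedata

# Assumption: same word-character categories as described in this answer.
WORD = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nd"}

def is_word(ch):
    return ch == "_" or unicodedata.category(ch) in WORD

def scan_variable(src):
    """Split source text starting with '$' into (variable name, remaining text)."""
    if not src.startswith("$"):
        raise ValueError("expected a '$' sigil")
    i = 1
    # Greedy scan: keep consuming for as long as the name can be extended.
    while i < len(src) and is_word(src[i]):
        i += 1
    return src[:i], src[i:]

print(scan_variable("$ΔΔ"))     # the whole text is taken as one name
print(scan_variable("$Δ\\Δ"))   # the backslash stops the name after the first Δ
print(scan_variable("$Δ²"))     # ² is not a word character, so it is left over
```

Because the second Δ is a letter, the greedy scan swallows it into the name, which is why $ΔΔ is reported as undeclared when only $Δ exists. The backslash is not a word character, so it ends the name and leaves Δ to be parsed as the postfix operator.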