Regexes are actually Methods:
say rx/foo/.^mro # ((Regex) (Method) (Routine) (Block) (Code) (Any) (Mu))
In that case, it means that they can act on self and are part of a class. What would that class be? My hunch is that it's the Match class and that they are actually acting on $/ (which they actually are). Any other way of formulating this?
Ultimately, all regexes expect to receive an invocant of type Match or some subclass of Match. In Perl 6, an invocant is simply the first argument, and is not special in any other way.
Those regexes declared with rule, token or regex within a package will be installed as methods on that package. Most typically, they are declared in a grammar, which is nothing more than a class whose default parent is Grammar rather than Any. Grammar is a sub-type of Match.
grammar G {}.^mro.say # ((G) (Grammar) (Match) (Capture) (Cool) (Any) (Mu))
It's thus quite natural to see these as just methods, but with a body written in a different language. In fact, that's precisely what they are.
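A minimal sketch of that point, assuming a hypothetical grammar G with a single token named digits (both names are made up for illustration). It looks the token up through the ordinary method-resolution machinery and then uses it as the starting rule for a parse:

```raku
# Hypothetical grammar; G and digits are illustrative names only.
grammar G {
    token digits { \d+ }
}

# The token is found by normal method lookup on the grammar type.
my $digits = G.^find_method('digits');
say $digits ~~ Regex;                # True
say $digits ~~ Method;               # True

# .parse can start from any such method via the :rule argument.
say G.parse('123', :rule<digits>);   # 「123」
```

The same lookup would work for a method declared with the method keyword, which is the sense in which the two are the same kind of thing.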
It's a little harder to see how the anonymous regexes are methods, in so far as they don't get installed in the method table of any type. However, if we were to write:
class C {
method foo() { 42 }
}
my $m = anon method () { self.foo }
say C.$m()
Then we see that we can resolve symbols on the invocant through self, even though this method is not actually installed on the class C. It's the same with anonymous regexes. The reason this matters is that assertions like <ident>, <.ws>, <?before foo> and friends are actually compiled into method calls.
Thus, anonymous regexes being methods, and thus treating their first argument as an invocant, is what allows the various builtin rules, which are declared on Match, to be resolved.
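As a small illustration, here is an anonymous regex that uses two of those builtin rules without any grammar or class being declared. The input string is made up for the example; <ident> captures, while <.ws> is the non-capturing form:

```raku
# An anonymous regex; no grammar or class is involved.
my $m = 'foo_bar = 42' ~~ / <ident> <.ws> '=' <.ws> \d+ /;

say $m.Str;          # foo_bar = 42
say $m<ident>.Str;   # foo_bar
```

Nothing in the surrounding scope defines ident or ws, which fits the explanation above that they are found by calling methods on the regex's invocant.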
A method does not have to correspond with any class:
my method bar () { say self, '!' }
bar 'Hello World'; # Hello World!
my regex baz { :ignorecase 'hello world' }
'Hello World' ~~ /<baz>/;
'Hello World' ~~ &baz;
&baz.ACCEPTS('Hello World'); # same as previous line
# baz 'Hello World';
By default, methods (and by extension regexes) have a has relationship with whatever class they are declared inside of.
class Foo {
method bar () { say self, '!' }
# has method bar () { say self, '!' }
regex baz { :ignorecase 'hello world' }
# has regex baz () { :ignorecase 'hello world' }
}
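To make the difference between the implicit has scope and an explicit my scope visible, here is a short sketch. The class Demo and the regex names installed and hidden are hypothetical, chosen only for this example:

```raku
class Demo {
    regex installed { a }      # implicit has scope: added to Demo's method table
    my regex hidden { b }      # lexical scope: only visible inside this block
}

say Demo.^can('installed').so;   # True
say Demo.^can('hidden').so;      # False
```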
A regex does need some requirements fulfilled by whatever its invocant is.
By just running it as a subroutine, it tells you the first one:
my regex baz { :ignorecase 'hello world' }
baz 'Hello World';
No such method '!cursor_start' for invocant of type 'Str'
in regex baz at <unknown file> line 1
in block <unit> at <unknown file> line 1
Usually a regex is declared inside of a class declared with grammar.
grammar Foo {
}
say Foo.^mro;
# ((Foo) (Grammar) (Match) (Capture) (Cool) (Any) (Mu))
So the requirements are likely fulfilled by Grammar, Match, or Capture in this case.
It could also be from a role that gets composed with it.
say Foo.^roles.map(*.^name);
# (NQPMatchRole)
There is even more reason to believe that it is Match or Capture:
my regex baz {
^
{ say 'baz was called on: ', self.^name }
}
&baz.ACCEPTS(''); # baz was called on: Match
my regex baz ( $s ) {
:ignorecase
"$s"
}
baz Match.new(orig => 'Hello World'), 'hello';
# 「Hello」
I see no reason someone couldn't do that themselves in a normal class though.
Note that $/ is just a variable. So saying it is passed to a regex is a misunderstanding of the situation.
my regex baz ( $/ ) {
:ignorecase
"$/"
}
'Hello World' ~~ /<baz('hello')>/;
# 「Hello」
# baz => 「Hello」
It would be more accurate to say that when calling a regex from inside of another one, the current $/ is used as the invocant to the method/regex.
(I'm not entirely sure this is actually what happens.)
So the previous example would then be sort-of like this:
'Hello World' ~~ /{ $/.&baz('hello') }/;
This explanation combines what I think Brad++ and Jonathan++ just taught me, with what I thought I already knew, with what I discovered as I dug further.
(My original goal was to directly explain Brad's mysterious No such method '!cursor_start' message. I've failed for now, and have instead just filed a bug report, but here's what else I ended up with.)
Methods are designed to work naturally in classes. Indeed a method declaration without a scope declarator assumes has -- and a has declaration belongs inside a class:
method bar {} # Useless declaration of a has-scoped method in mainline
But in fact methods also work fine as either:
- subs (i.e. not behaving as an object oriented method at all); or
- methods for prototype-based programming (i.e. object orientation, but without classes).
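What really makes methods methods is that they are routines with an "invocant". An invocant is a special status first parameter that:
- Is implicitly inserted into the signature if not explicitly declared. If the method is declared inside a class, the type constraint is that class; otherwise it is Mu:
class foo { my method bar {} .signature .say } # (foo: *%_)
my method bar {} .signature .say # (Mu: *%_)
- Is a required positional. Thus:
my method bar {}
bar # Too few positionals passed; expected 1 argument but got 0
- Is always aliased to self. Thus:
my method bar { say self }
bar 42 # 42
- Can be explicitly declared by making it the first parameter in the signature and following it with a colon (:). Thus:
my method bar (Int \baz:) { say baz }
say &bar.signature; # (Int \baz: *%_)
bar 42; # 42
bar 'string'; # Type check failed in binding to parameter 'baz'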
Focusing just on the invocant perspective, regexes are methods that take/expect a match object as their invocant.
A regex is typically called in three somewhat different scenarios:
By direct use. For example my regex foo { . }; say 'a' ~~ &foo; # 「a」 (or just say 'a' ~~ / . /; # 「a」, but I'll only cover the essentially identical named example to simplify my explanation). This translates to say &foo.ACCEPTS: 'a'. This in turn is implemented by this code in Rakudo. As you can see, this calls the regex foo with the invocant Match.'!cursor_init'(...) -- which runs this code without :build. The upshot is that foo gets a new Match object as its invocant.
By way of the Grammar class's .parse method. The .parse method creates a new instance of the grammar and then calls the top "rule" (rule/token/regex/method) on that new grammar object. Note that a Grammar is a sub-class of Match; so, just as with the first scenario, the rule/regex is being passed an as-yet-empty match object. If the top rule matches, the new grammar/match object will be returned by the call to .parse. (Otherwise it'll return Nil.)
By way of one of the above. The top rule in a grammar will typically contain calls to lower level rules/tokens/regexes/methods. Likewise, a free standing rule/regex may contain calls to other rules/regexes. Each such call will involve creating another new grammar/match object that becomes the invocant for the nested call. If the nested call matches, and it's a capturing call, then the new grammar/match object is added to the higher level grammar/match object.
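The second and third scenarios can be sketched together with a tiny grammar. The grammar Sum and its token names are hypothetical; the example only shows what is observable from the outside: a successful .parse returns something that is a Match, capturing nested calls (<num>) are attached to it, non-capturing ones (<.plus>) are not, and a failed parse returns Nil.

```raku
# Hypothetical grammar; Sum, num and plus are illustrative names only.
grammar Sum {
    token TOP  { <num> <.plus> <num> }
    token num  { \d+ }
    token plus { '+' }
}

my $m = Sum.parse('1+2');
say $m ~~ Match;         # True
say $m<num>.elems;       # 2
say ~$m<num>[1];         # 2
say $m.hash.keys;        # (num)

say Sum.parse('1-2');    # Nil
```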