I am trying to match any keywords in a group. Keywords are in array @b. I am unable to make case-insensitive matches. I have done some testing, and the following script is an example:
> my $line = "this is a test line";
this is a test line
> my @b = < tes lin > ;
[tes lin]
> my regex a { || @b };
regex a { || @b }
> say $line ~~ m:i/ <a> / # matching the first as expected
「tes」
a => 「tes」
> say $line ~~ m:i:g/ <a> / # matching both as expected
(「tes」
a => 「tes」 「lin」
a => 「lin」)
> my @b = < tes LIN > ;
[tes LIN]
> my regex a { || @b };
regex a { || @b }
> say $line ~~ m:i:g/ <a> / # should match both "tes" and "LIN" but skips "LIN"
(「tes」
a => 「tes」)
> my @b = < TES lin >
[TES lin]
> my regex a { || @b }
regex a { || @b }
> say $line ~~ m:i:g/ <a> / # expect to match both but skips "TES"
(「lin」
a => 「lin」)
Also, converting the keywords to lower case does not work:
> my @b = < TES lin >.lc
[tes lin]
> my regex a { || @b }
regex a { || @b }
> say $line ~~ m:i:g/ <a> /
()
My question is, how should case-insensitivity be handled when a regex/subrule is actually called?
I tried putting the :i adverb inside regex a, but the resulting matches are all empty:
> my regex a { :i || @b }
regex a { :i || @b }
> say $line ~~ m:i:g/ <a> /
(「」
a => 「」 「」
and then 19 lines of "a => 「」 「」"
a => 「」)
The problem with:
my regex a { || @b }
say $line ~~ m:i/ <a> /
is that the regex a is the one in charge of matching the values in @b, and it isn't compiled with :i.
In Perl 6, regexes are code; you can't change how a regex works from a distance like that.
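A minimal sketch of that scoping, assuming the same $line as in the question and a mixed-case keyword list (the name plain is made up for illustration). If :i at the call site could reach into the called regex, the two results would differ; going by the output shown in the question, both should contain only tes:

```raku
my $line = "this is a test line";
my @b = <tes LIN>;

# No :i inside the regex that actually matches the keywords.
my regex plain { || @b }

my $without = $line ~~ m:g/ <plain> /;    # no :i anywhere
my $with    = $line ~~ m:i:g/ <plain> /;  # :i only at the call site

say $without;
say $with;
```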
Then there is another problem with:
my regex a { :i || @b }
It is really compiled as:
my regex a {
[ :i ]
||
[ @b ]
}
That is, match ignorecase[nothing], and if that fails (it won't fail), match one of the values in @b. Since the empty first branch always succeeds, every match comes back empty.
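A small sketch of why the matches come out empty, using the same @b as the question and assuming the regex compiles as it did in the question's REPL session. The first branch succeeds at the very start of the string without consuming anything, so the match should have zero length:

```raku
my @b = <tes lin>;
my regex a { :i || @b }

my $m = "this is a test line" ~~ / <a> /;
say $m.from;    # position where the match starts
say $m.chars;   # number of characters the match covers
```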
The only reason to use || @… is so that it matches the values in @… in the order they are defined.
> my @c = < abc abcd foo >;
> say 'abcd' ~~ / || @c /
「abc」
I think that in most cases it would actually work better to just use the default | semantics.
> my @c = < abc abcd foo >;
> say 'abcd' ~~ / | @c /
「abcd」
> say 'abcd' ~~ / @c /
「abcd」
So then this would work the way you want it to:
my regex a { :i @b }
The difference is that <a>|<b> will match whichever alternative has the longest starting expression, while <a>||<b> will try <a> first, and only if that fails will it try <b>.
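As a quick check of the :i @b form against the question's data, here is a sketch using the mixed-case keyword list from the question. It should pick up both keywords, since :i now sits inside the regex that does the matching:

```raku
my $line = "this is a test line";
my @b = <TES lin>;

# :i is part of regex a itself, so it applies to the values in @b.
my regex a { :i @b }

my $matches = $line ~~ m:g/ <a> /;
say $matches;
```

Note that no :i is needed at the call site here.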
If you really want || semantics, any of these would work:
my regex a { || :i @b }
my regex a { :i [|| @b] }
The following doesn't have || semantics. In fact, the || doesn't do anything.
my regex a { || [:i @b] }
It is the same as these:
my regex a { | :i @b }
my regex a { :i @b }
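As for the lower-casing attempt in the question, that looks like a separate issue rather than a regex problem. As far as I can tell, calling .lc on the whole list stringifies it first, so @b ends up holding the single keyword "tes lin" (the REPL's [tes lin] display hides this). A sketch of the difference, with variable names made up for illustration:

```raku
# .lc on the list as a whole: the list is joined into one string first.
my @joined = < TES lin >.lc;
say @joined.elems;    # 1

# Lower-casing each keyword separately keeps two elements.
my @each = < TES lin >.map(*.lc);
say @each.elems;      # 2
```

With :i inside the regex, the lower-casing step should not be needed at all.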