How to make subrule/regex case-insensitive when used in match?

Question

I am trying to match any keywords in a group. Keywords are in array @b. I am unable to make case-insensitive matches. I have done some testing, and the following script is an example:

> my $line = "this is a test line";
this is a test line

> my @b = < tes lin > ; 
[tes lin]

> my regex a { || @b };
regex a { || @b }

> say $line ~~ m:i/ <a> /    # matching the first as expected
｢tes｣
 a => ｢tes｣

> say $line ~~ m:i:g/ <a> /  # matching both as expected
(｢tes｣
 a => ｢tes｣ ｢lin｣
 a => ｢lin｣)

> my @b = < tes LIN > ; 
[tes LIN]
> my regex a { || @b };
regex a { || @b }
> say $line ~~ m:i:g/ <a> /   # should match both "tes" and "LIN" but skips "LIN"
(｢tes｣
 a => ｢tes｣)

> my @b = < TES lin >
[TES lin]
> my regex a { || @b }
regex a { || @b }
> say $line ~~ m:i:g/ <a> /   # expect to match both but skips "TES"
(｢lin｣
 a => ｢lin｣)

Also, mapping the keywords to all lowercase does not work:

> my @b = < TES lin >.lc
[tes lin]
> my regex a { || @b }
regex a { || @b }
> say $line ~~ m:i:g/ <a> /
()

My question is, how should case-insensitivity be handled when a regex/subrule is actually called?

I tried putting the :i adverb inside regex a, but the resulting matches are all empty:

> my regex a { :i || @b }
regex a { :i || @b }
> say $line ~~ m:i:g/ <a> /
(｢｣
 a => ｢｣ ｢｣

and then 19 lines of "a => ｢｣｢｣"

 a => ｢｣)

Brad Gilbert · Accepted Answer

The problem with:

my regex a { || @b }
say $line ~~ m:i/ <a> /

is that a is the regex in charge of matching the values in @b, and it isn't compiled with :i. The :i on the outer m:i/ <a> / only applies to the text of that outer regex.
In Perl 6 (now Raku), regexes are code; you can't change how a regex works from a distance like that.

Then there is another problem with:

my regex a { :i || @b }

It is really compiled as:

my regex a {
     [ :i    ]
  ||
     [    @b ]
}

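That is: match nothing, case-insensitively, and only if that fails (it never does) match one of the values in @b. The empty first branch always succeeds, which is why every match in the question came back empty.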
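A minimal sketch of that behaviour, reusing the question's line and keywords. The variable name $m is only for illustration. It checks the length of what <a> matched rather than relying on the printed form of the Match:

```raku
my @b = < tes lin >;

# The :i sits alone in the first branch of the || alternation,
# so that branch is an empty pattern that always succeeds.
my regex a { :i || @b }

my $m = "this is a test line" ~~ / <a> /;

# Number of characters matched by the whole pattern
say $m.chars;
```

If the first branch is indeed the empty one, the match succeeds at the very start of the string without consuming anything, and @b is never consulted.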

The only reason to use || @… is so that it matches the values in @… in the order they are defined.

> my @c = < abc abcd foo >;

> say 'abcd' ~~ / || @c /
｢abc｣

I think that in most cases it would actually work better to just let it be the default | semantics.

> my @c = < abc abcd foo >;

> say 'abcd' ~~ / |  @c /
｢abcd｣
> say 'abcd' ~~ /    @c /
｢abcd｣

So then this would work the way you want it to:

my regex a { :i @b }

To spell out the difference between the two operators, for any two subrules <a> and <b>: <a> | <b> matches whichever alternative has the longest match, while <a> || <b> tries <a> first and only tries <b> if <a> fails.
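Putting the pieces together, here is a small end-to-end sketch using the question's line and mixed-case keywords. The name @found is just for illustration, and the matches are converted to plain strings so the result is easy to inspect:

```raku
my $line = "this is a test line";
my @b = < tes LIN >;          # mixed-case keywords, as in the question

# :i is compiled into the regex itself, so it covers the values of @b
my regex a { :i @b }

# No :i is needed at the call site; :g collects every match
my @found = ($line ~~ m:g/ <a> /).map(~*);
say @found;
```

With :i inside the regex, both keywords should be found regardless of how they are capitalised in @b.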

If you really want || semantics, any of these would work:

my regex a {     ||  :i @b  }
my regex a { :i [||     @b] }

The following doesn't have || semantics.
In fact the || doesn't do anything.

my regex a {     || [:i @b] }

It is the same as these:

my regex a {     |   :i @b  }
my regex a {         :i @b  }
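A side note on the .lc attempt in the question: as far as I can tell, it returned no matches because calling .lc on the whole list stringifies the list first, producing the single string "tes lin" rather than two lowercased keywords. The sketch below (names @upper, @joined and @each are only for illustration) compares that with lowercasing each element:

```raku
my @upper = < TES lin >;

# .lc on the whole list stringifies it first: one string, not two keywords
my @joined = @upper.lc;
say @joined.elems;

# The hyper method call lowercases each element separately
my @each = @upper».lc;
say @each.elems;
```

Lowercasing the keywords only helps when the input text is lowercase too, so putting :i inside the regex remains the more general fix.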
