
Lookaround regex and character consumption

Based on the documentation for Raku's lookaround assertions, I read the regex / <?[abc]> <alpha> / as saying "starting from the left, match but do not consume one character that is a, b, or c and, once you have found a match, match and consume one alphabetic character."

Thus, this output makes sense:

'abc' ~~ / <?[abc]> <alpha> /     # OUTPUT: «「a」␤ alpha => 「a」»

Even though that regex has two one-character terms, one of them does not capture, so our total capture is only one character long.

But the next expression confuses me:

'abc' ~~ / <?[abc\s]> <alpha> /     # OUTPUT: «「ab」␤ alpha => 「b」»

Now, our total capture is two characters long, and one of those isn't captured by <alpha>. So is the lookaround capturing something after all? Or am I misunderstanding something else about how the lookaround works?

Asked by codesections on Jan 30 '26 at 01:01


1 Answer

<?[ ]> and <![ ]> do not seem to support some backslashed character classes. \n, \s, \d and \w all show similar results.

<?[abc\s]> behaves the same as <[abc\s]> when \n, \s, \d or \w is added.

\t, \h, \v, \c[NAME] and \x61 seem to work as normal.
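
One way to sidestep this is to spell the lookahead out with the explicit <?before ...> assertion and put the character class inside it. This is only a minimal sketch: it assumes the explicit form is not affected by the same issue, and the variable name $m is just for illustration.

```raku
# Zero-width lookahead written with <?before ...> instead of <?[...]>.
# The character class, including \s, sits inside the assertion.
my $m = 'abc' ~~ / <?before <[abc\s]> > <alpha> /;

say ~$m;          # text of the overall match
say ~$m<alpha>;   # text captured by <alpha>
```

If the lookahead consumes nothing, both lines should print the same single character, matching what the <?[abc]> version in the question produces.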

Answered by Markus Jarderot on Feb 01 '26 at 18:02


