How can I use capturing groups inside a lookahead assertion?
This code:
say "ab" ~~ m/(a) <?before (b) > /;
returns:
「a」
0 => 「a」
But I was expecting to also capture 'b'.
Is there a way to do so?
I don't want to leave 'b' outside of the lookahead because I don't want 'b' to be part of the match.
Is there a way to capture 'b' but still leave it outside of the match?
NOTE:
I tried to use Raku's capture markers, as in:
say "ab" ~~ m/<((a))> (b) /;
「a」
0 => 「a」
1 => 「b」
But this does not work as I expect: even though 'b' is left outside the match, the regex has still processed 'b', and I don't want it to be processed either.
For example:
say 'abab' ~~ m:g/(a)<?before b>|b/;
(「a」
0 => 「a」
「b」
「a」
0 => 「a」
「b」)
# Four matches (what I want)
say 'abab' ~~ m:g/<((a))>b|b/;
(「a」
0 => 「a」
「a」
0 => 「a」)
# Two matches
Is there a way to do so?
Not really, but sort of. Three things conspire against us in trying to make this happen.
1. Captures nest: (a(b)) results in one positional capture that contains another positional capture. Why do I mention this? Because the same thing is going on with things like before, which take a regex as an argument: the regex passed to before gets its own Match object.
2. The ? implies "do not capture". We may think of dropping it to get <before (b)>, and indeed there is a before key in the Match object now, which sounds promising, except...
3. before doesn't actually return what it matched on the inside, but instead a zero-width Match object; otherwise, if we did forget the ?, we'd end up with it not being a lookahead.

If only we could rescue the Match object from inside of the lookahead. Well, we can! We can declare a variable and then store the $/ from inside the before argument regex in it:
say "ab" ~~ m/(a) :my $lookahead; <?before b {$lookahead = $/}> /;
say $lookahead;
Which gives:
「a」
0 => 「a」
「b」
Which works, although it's unfortunately not attached like a normal capture. There's no way to do that, although we can attach it via make:
say "ab" ~~ m/(a) :my $lookahead; <?before (b) {$lookahead = $0}> { make $lookahead } /;
say $/.made;
With the same output, except now it will be reliably attached to each match object coming back from m:g, and so will be robust, even if not beautiful.
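To tie this back to the 'abab' example from the question, here is a minimal sketch of the make approach combined with m:g. It assumes the same pattern as above with the question's | b alternative added; the names @matches and $peeked are only illustrative. The intent is that every 'a' match carries the 'b' seen by the lookahead in .made, while the plain 'b' matches carry nothing because that branch never calls make.

```raku
# Sketch: global matching where each 'a' match remembers the 'b' that
# follows it, without that 'b' being consumed by the match.
my @matches = 'abab' ~~ m:g/
    (a)
    :my $lookahead;                      # lexical to this regex
    <?before (b) { $lookahead = $0 }>    # stash the capture made inside the lookahead
    { make $lookahead }                  # attach it to the match being built
    | b
/;

for @matches -> $m {
    # .made stays undefined for the 'b' branch, which never calls make
    my $peeked = $m.made.defined ?? ~$m.made !! '(none)';
    say "match: ", ~$m, " from ", $m.from, " to ", $m.to, " lookahead: ", $peeked;
}
```

Because the 'b' is only looked at and not consumed, the next iteration of m:g is still free to match it through the second alternative, which is the behaviour the capture-marker version in the question could not provide.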