I have (what I believe to be) a negative lookahead assertion, <@> *(?!QQQ), that I expect to match if the tested string is a <@> followed by any number of spaces (including zero) and then not followed by QQQ.
Yet, if the tested string is <@> QQQ, the regular expression matches.
I fail to see why this is the case and would appreciate any help on this matter.
Here's a test script
use warnings;
use strict;

my @strings = ('something <@> QQQ',
               'something <@> RRR',
               'something <@>QQQ' ,
               'something <@>RRR' );

print "$_\n" for map {$_ . " --> " . rep($_) } (@strings);

sub rep {
  my $string = shift;
  $string =~ s,<@> *(?!QQQ),at w/o ,;
  $string =~ s,<@> *QQQ,at w/ QQQ,;
  return $string;
}
This prints
something <@> QQQ --> something at w/o QQQ
something <@> RRR --> something at w/o RRR
something <@>QQQ --> something at w/ QQQ
something <@>RRR --> something at w/o RRR
And I'd have expected the first line to be something <@> QQQ --> something at w/ QQQ.
A regular expression (regex or RE) in Perl is a special string that describes a search pattern to look for in a given string. An assertion in a regular expression is a condition that must hold at the current position for the match to proceed; it does not consume any characters itself.
Positive lookahead: (?= «pattern») matches if pattern matches what comes after the current location in the input string. Negative lookahead: (?! «pattern») matches if pattern does not match what comes after the current location in the input string.
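To make the two definitions concrete, here is a minimal sketch. The helper name match_of and the foo/bar test strings are made up for illustration and do not come from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text the pattern matched,
# or undef if the pattern does not match at all.
sub match_of {
    my ($string, $regex) = @_;
    return $string =~ /($regex)/ ? $1 : undef;
}

my @cases = (
    [ 'foobar', qr/foo(?=bar)/ ],   # positive lookahead, "bar" follows
    [ 'foobaz', qr/foo(?=bar)/ ],   # positive lookahead, "bar" does not follow
    [ 'foobaz', qr/foo(?!bar)/ ],   # negative lookahead, "bar" does not follow
    [ 'foobar', qr/foo(?!bar)/ ],   # negative lookahead, "bar" follows
);

for my $case (@cases) {
    my ($string, $regex) = @$case;
    my $m = match_of($string, $regex);
    printf "%s =~ %s : %s\n", $string, $regex,
        defined $m ? "matched '$m'" : 'no match';
}
```

Note that in the matching cases only foo is part of the match. The lookahead checks what follows but does not consume it.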
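It matches because zero is included in "any number". The engine first lets " *" take the space, but then the lookahead fails because QQQ follows, so it backtracks and lets " *" match zero spaces instead. At that position the next characters are a space and then QQQ, not QQQ itself, so the lookahead succeeds. In other words: no spaces, followed by a space, matches "any number of spaces not followed by QQQ".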
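One way to see this for yourself is to capture what the pattern actually matched. This is only a sketch: matched_text is a made-up helper name, and the @ is escaped purely to rule out any array interpolation.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text matched by the pattern from the
# question, wrapped in brackets so trailing spaces are visible,
# or undef if the pattern does not match.
sub matched_text {
    my ($string) = @_;
    return $string =~ /(<\@> *(?!QQQ))/ ? "[$1]" : undef;
}

for my $s ('something <@> QQQ', 'something <@> RRR', 'something <@>QQQ') {
    my $m = matched_text($s);
    printf "'%s' matched %s\n", $s, defined $m ? $m : 'nothing';
}
```

For the first string the match is [<@>] with zero spaces, for the second it is [<@> ] including the space, and the third does not match at all.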
You should add another lookahead assertion that the first thing after your spaces is not itself a space. Try this (untested):
<@> *(?!QQQ)(?! )
ETA Side note: changing the quantifier to + would have helped only when there's exactly one space; in the general case, the regex can always grab one less space and therefore succeed. Regexes want to match, and will bend over backwards to do so in any way possible. All other considerations (leftmost, longest, etc) take a back seat - if it can match more than one way, they determine which way is chosen. But matching always wins over not matching.
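Since the pattern above was posted untested, here is a small sketch that runs it against the strings from the question and also checks the claim about the + quantifier. The names rep_fixed and spaces_taken are made up for this example, and the @ is escaped only to rule out array interpolation.

```perl
use strict;
use warnings;

# rep() from the question with the suggested extra (?! ) lookahead added.
sub rep_fixed {
    my $string = shift;
    $string =~ s,<\@> *(?!QQQ)(?! ),at w/o ,;
    $string =~ s,<\@> *QQQ,at w/ QQQ,;
    return $string;
}

# Hypothetical helper: number of spaces the '+' variant consumed,
# or undef if that variant does not match at all.
sub spaces_taken {
    my ($string) = @_;
    return $string =~ /<\@>( +)(?!QQQ)/ ? length($1) : undef;
}

print "$_ --> ", rep_fixed($_), "\n"
    for 'something <@> QQQ', 'something <@> RRR',
        'something <@>QQQ',  'something <@>RRR';

# One space versus two spaces before QQQ, using the '+' variant.
for my $s ('something <@> QQQ', 'something <@>  QQQ') {
    my $n = spaces_taken($s);
    printf "'%s': %s\n", $s,
        defined $n ? "matched, took $n space(s)" : 'no match';
}
```

With the extra lookahead, the first string comes out as something at w/ QQQ. The + variant fails on one space but still matches on two, because it can give one space back. The two lookaheads could also be written as a single (?! |QQQ), which should behave the same way.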