 

Does the following capture group notation mean something to Perl?

In the following, why does the condition evaluate to false?

$_ = "aa11bb";  
if(/(.)\111/){  
    print "It matched!\n";  
}  

Do \11 or \111 have a special meaning, so that Perl cannot "see" the \1?

Jim asked Dec 20 '22 02:12



2 Answers

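Perl is interpreting \111 as an octal escape (the character whose ordinal in octal is 111, an uppercase "I" in ASCII), which does not occur in your string. A backslash followed by two or more digits is treated as a backreference only if at least that many capture groups have been opened before it. To avoid the ambiguity, use \g or \g{}; in this pattern the braced form \g{1} is the one to use, since digits follow the reference. Quoting the docs (perlre - Capture Groups):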

The \g and \k notations were introduced in Perl 5.10.0. Prior to that there were no named nor relative numbered capture groups. Absolute numbered groups were referred to using \1, \2, etc., and this notation is still accepted (and likely always will be). But it leads to some ambiguities if there are more than 9 capture groups, as \10 could mean either the tenth capture group, or the character whose ordinal in octal is 010 (a backspace in ASCII). Perl resolves this ambiguity by interpreting \10 as a backreference only if at least 10 left parentheses have opened before it. Likewise \11 is a backreference only if at least 11 left parentheses have opened before it. And so on. \1 through \9 are always interpreted as backreferences. There are several examples below that illustrate these perils. You can avoid the ambiguity by always using \g{} or \g if you mean capturing groups; and for octal constants always using \o{}, or for \077 and below, using 3 digits padded with leading zeros, since a leading zero implies an octal constant.
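
As a rough illustration of the quoted rule, here is a minimal sketch, assuming Perl 5.10 or later (needed for \g{}). The variable names are made up for the example and do not come from the question. Since octal 111 is the character "I", the original pattern is really looking for any character followed by a literal "I".

```perl
use strict;
use warnings;

my $str = "aa11bb";

# With only one capture group open, \111 is read as an octal escape
# (chr(0111) is "I"), not as \1 followed by the digits "11".
my $ambiguous = ($str =~ /(.)\111/) ? 1 : 0;

# \g{1} is unambiguously a backreference to group 1; the "11" after it
# is then matched literally.
my $explicit = ($str =~ /(.)\g{1}11/) ? 1 : 0;

# The ambiguous pattern does match when some character is followed by "I".
my $octal_hit = ("xI" =~ /(.)\111/) ? 1 : 0;

print "ambiguous: $ambiguous, explicit: $explicit, octal: $octal_hit\n";
# prints "ambiguous: 0, explicit: 1, octal: 1"
```

The braced form is used here because the backreference is immediately followed by digits.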

sidyll answered May 03 '23 10:05



Perl is treating \111 as a single token, because nothing separates the \1 from the 11. If you use the /x modifier, which makes whitespace in the pattern insignificant, you can insert a space to remove the ambiguity:

if(/(.)\1 11/x) { ...
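
A small sketch of how this could be checked, using the string from the question. The variable names are illustrative only, and the bracketed character class is just one other possible separator for patterns where /x is not wanted; it is not part of the answer above.

```perl
use strict;
use warnings;

my $str = "aa11bb";

# Under /x, unescaped whitespace in the pattern is ignored, so the space
# only ends the \1 token; the pattern still means: any character, the
# same character again, then the literal "11".
my $with_x = ($str =~ /(.)\1 11/x) ? 1 : 0;

# Without /x, putting the first digit in a character class also keeps
# it from being read as part of the escape.
my $with_class = ($str =~ /(.)\1[1]1/) ? 1 : 0;

print "with /x: $with_x, with class: $with_class\n";
# prints "with /x: 1, with class: 1"
```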
AKHolland answered May 03 '23 11:05
