
Why are `\Q` `\E` in a Perl pattern in some cases interpreted as literal `Q` `E`?

Tags: regex, perl

If ' is the delimiter, or if the pattern is interpolated from a variable, the regular expression \QW\ER matches QWER and not WR, i.e., \Q and \E in the pattern are interpreted not as a quoting escape but as literal Q and E. I observed this with Perl v5.6.2, v5.10.1 and v5.18.2, and at http://www.perlfect.com/articles/regextutor.shtml.

Example:

#!/usr/bin/env perl
$re = '\QW\ER';
print '$re = ', $re, "\n";
while (<DATA>)
{
    print qw(/\QW\ER/), "  matches ", $_ if /\QW\ER/;
    print qw(m'\QW\ER'), " matches ", $_ if m'\QW\ER';
    print qw(/$re/), "     matches ", $_ if /$re/;
}
__DATA__
QWERT
WRONG

Output:

$re = \QW\ER
m'\QW\ER' matches QWERT
/$re/     matches QWERT
/\QW\ER/  matches WRONG

(Only the last line is what I had expected.)

Is this a bug? ... a feature? ... documented anywhere?

Armali asked Jan 14 '14 12:01


2 Answers

You will see this if you define the regex in a single-quoted string that contains escapes. Build it with qr// instead:

# don't use strings if you have escapes:
#  my $re = '(?<=\QW\E)R';
my $re = qr/(?<=\QW\E)R/;
/($re)/ and print "$_: $1\n" for qw(QWERT WRONG);
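
If the text to be quoted has to come from a variable, something like the following sketch may help. The names `$literal`, `$re_escape` and `$re_quotemeta` are illustrative and not from the question. `\Q...\E` takes effect here because it is written in the pattern literal itself and wraps the interpolated variable; `quotemeta` applies the same escaping explicitly at run time.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical text that should be matched verbatim.
my $literal = 'W';

# \Q and \E are part of the pattern literal, so they are processed
# while $literal is interpolated into the pattern.
my $re_escape = qr/(?<=\Q$literal\E)R/;

# quotemeta() performs the same escaping as an ordinary function call.
my $quoted       = quotemeta $literal;
my $re_quotemeta = qr/(?<=$quoted)R/;

for my $word (qw(QWERT WRONG)) {
    print "$word: $1 (via \\Q..\\E)\n"  if $word =~ /($re_escape)/;
    print "$word: $1 (via quotemeta)\n" if $word =~ /($re_quotemeta)/;
}
```

Both patterns should match only WRONG, since QWERT has no R directly preceded by W.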
perreal answered Nov 15 '22 07:11


I found the explanation in the Perl Language reference, section perlop:

The following escape sequences are available in constructs that interpolate, ...

\Q          quote (disable) pattern metacharacters till \E or
            end of string
\E          end either case modification or quoted section
            (whichever was last seen)

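Since m'\QW\ER' uses ' as the delimiter, which suppresses interpolation, \Q and \E are not available as a quoting escape there. The same applies to $re: it was assigned from a single-quoted string, so the backslash sequences reach the regex engine unprocessed, and the engine treats the unrecognized escapes \Q and \E as literal Q and E.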
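
A minimal sketch of where \Q and \E get processed, assuming a slightly different test string (`W.R`, with a metacharacter) than in the question so that the quoting becomes visible. The variable names are illustrative. The `no warnings 'regexp'` line only silences the "Unrecognized escape \Q passed through" warning that Perl would otherwise emit for the raw string.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Double quotes interpolate, so \Q...\E is applied while the string is built.
my $interpolated = "\QW.\ER";    # string now holds W\.R
# Single quotes do not interpolate, so the backslashes stay in the string.
my $raw          = '\QW.\ER';    # string still holds \QW.\ER

print "interpolated: $interpolated\n";
print "raw:          $raw\n";

{
    # The regex engine passes the unknown escapes \Q and \E through
    # as literal Q and E; suppress the warning it issues about that.
    no warnings 'regexp';
    for my $text (qw(W.R WxR QWxER)) {
        print "interpolated matches $text\n" if $text =~ /$interpolated/;
        print "raw matches $text\n"          if $text =~ /$raw/;
    }
}
```

With the interpolated string the dot is escaped, so only W.R should match. With the raw string the pattern behaves like QW.ER, so only QWxER should match.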

Armali answered Nov 15 '22 06:11