
Why does my non-greedy Perl regex match nothing?

I thought I understood Perl RE to a reasonable extent, but this is puzzling me:

#!/usr/bin/perl
use strict;
use warnings;

my $test = "'some random string'";

if ($test =~ /\'?(.*?)\'?/) {
    print "Captured $1\n";
    print "Matched $&";
}
else {
    print "What?!!";
}

prints

Captured
Matched '

It seems it has matched the ending ' alone, and so captured nothing.
I would have expected it to match the entire thing, or if it's totally non-greedy, nothing at all (as everything there is an optional match).
This in-between behaviour baffles me. Can anyone explain what is happening?

Sundar R asked Nov 29 '22 20:11


2 Answers

The \'? at the beginning and end means match 0 or 1 apostrophes greedily. (As another poster has pointed out, to make it non-greedy, it would have to be \'??)

The .*? in the middle means match 0 or more characters non-greedily.

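The Perl regular expression engine starts at the beginning of the string. The leading \'? is greedy, so it picks up the first apostrophe. The (.*?) then matches non-greedily, taking as little as it can, which is the empty string. Finally, the trailing \'? is tried against the next character, s, which is not an apostrophe, so it also matches the empty string. All three parts have now succeeded, so the engine stops: the overall match is just the opening apostrophe (not the closing one), and $1 is empty.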
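To see where the match actually lands, here is a minimal sketch that reports the match offsets using Perl's built-in @- and @+ arrays. The helper name match_info is made up for this illustration and is not from the original post; the second pattern is the \'?? variant mentioned above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the start offset,
# end offset, whole matched text, and first capture, or an empty list if the
# pattern does not match.
sub match_info {
    my ($string, $regex) = @_;
    return unless $string =~ $regex;
    my ($start, $end) = ($-[0], $+[0]);   # offsets of the whole match
    my $capture = $1;                     # copy $1 before leaving this scope
    return ($start, $end, substr($string, $start, $end - $start), $capture);
}

my $test = "'some random string'";

# Greedy optional quotes, as in the question.
my @greedy = match_info($test, qr/\'?(.*?)\'?/);
printf "greedy:     offsets %d-%d, matched [%s], captured [%s]\n", @greedy;

# Non-greedy optional quotes: every part prefers to match nothing.
my @lazy = match_info($test, qr/\'??(.*?)\'??/);
printf "non-greedy: offsets %d-%d, matched [%s], captured [%s]\n", @lazy;
```

The greedy version reports offsets 0-1, i.e. the opening apostrophe, while the \'?? version matches the empty string at offset 0.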

Simon Nickerson answered Dec 05 '22 05:12


I think you mean something like:

/'(.*?)'/      # matches everything in single quotes

or

/'[^']*'/      # matches everything in single quotes, but faster

The single quotes don't need to be escaped, AFAIK.
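As a rough sketch of how the two patterns could be used, the snippet below wraps each in a small sub. The sub names are hypothetical, and the capture group in the second pattern is an addition for illustration (the pattern as written above has none).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers (names invented for this example). Each returns the
# text inside the first pair of single quotes, or undef if there is no pair.
sub unquote_lazy {
    my ($string) = @_;
    my ($inner) = $string =~ /'(.*?)'/;     # non-greedy: stop at the next quote
    return $inner;
}

sub unquote_negated {
    my ($string) = @_;
    my ($inner) = $string =~ /'([^']*)'/;   # negated class: anything but a quote
    return $inner;
}

my $test = "'some random string'";
print "lazy:    ", unquote_lazy($test), "\n";
print "negated: ", unquote_negated($test), "\n";
```

Both print "some random string" for the test string. One difference to be aware of: without the /s modifier, . does not match a newline, whereas [^'] does, so the two patterns behave differently on quoted text that spans lines.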

Tomalak answered Dec 05 '22 06:12