
what is the behavior of (.)+ in a regex?

Tags: regex, perl

We just found a bug in some code where the programmer had used the equivalent of (.)+ when they should have used (.+). An easy enough fix, but we're unable to explain the behavior of (.)+. Can anyone explain why this matches "e", the last letter, and not "b", the first letter after the "a" in the regex? How would you explicate (.)+?

my $s = 'abcde';

if ($s =~ m{ a (.)+  }x ){
    print "s '$s' matched '$1'\n";
}else{
    print "total match fail\n";
}

__END__
output:
s 'abcde' matched 'e'
Kevin G. asked Oct 05 '15 16:10


1 Answer

There's a huge difference between (.)+ and (.+) but only in terms of what is captured, not what is matched.

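(.)+ matches one or more characters, one per iteration of the group. Each iteration overwrites the previous capture, so after the match $1 holds only the last character matched ("e" in your example).

(.+) matches the same run of characters, but the quantifier is inside the group, so the whole run ("bcde" in your example) is captured at once.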
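To see the two side by side, here is a minimal sketch that reuses the string from the question. The variable names ($last, $all, $matched_repeated, $matched_grouped) are only illustrative and are not from the original code.

```perl
use strict;
use warnings;

my $s = 'abcde';

# (.)+ : the group is re-entered on every iteration of +, so the capture
# is overwritten each time and ends up holding only the final character.
my ($last) = $s =~ m{ a (.)+ }x;

# (.+) : the quantifier is inside the group, so the whole run is captured.
my ($all) = $s =~ m{ a (.+) }x;

# The overall match ($&) is the same for both patterns; only $1 differs.
$s =~ m{ a (.)+ }x;
my $matched_repeated = $&;
$s =~ m{ a (.+) }x;
my $matched_grouped = $&;

print "(.)+ captured '$last', matched '$matched_repeated'\n";
print "(.+) captured '$all', matched '$matched_grouped'\n";
```

This should print 'e' and 'bcde' as the captures, with 'abcde' as the overall match in both cases.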

tadman answered Nov 08 '22 17:11