
Is there a bug in Ruby lookbehind assertions (1.9/2.0)?

Why doesn't the regex (?<=fo).* match foo (whereas (?<=f).* does)?

"foo" =~ /(?<=f).*/m          => 1
"foo" =~ /(?<=fo).*/m         => nil

This only seems to happen with singleline mode turned on (Ruby's /m modifier, where the dot matches newlines); without it, everything is OK:

"foo" =~ /(?<=f).*/           => 1
"foo" =~ /(?<=fo).*/          => 2

Tested on Ruby 1.9.3 and 2.0.0.
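
For anyone who wants to reproduce this locally, here is a minimal sketch that prints the interpreter version next to each result. The helper name match_offset is made up for illustration; the patterns are the ones from the question. On an affected interpreter (1.9.3 or 2.0.0, as tested above) the second pattern should report nil.

```ruby
# Hypothetical helper (not part of any library): returns the offset of the
# first match of +pattern+ in +subject+, or nil when there is no match.
def match_offset(pattern, subject = "foo")
  subject =~ pattern
end

# Patterns taken from the question; the labels are only for display.
patterns = [
  ["/(?<=f).*/m",  /(?<=f).*/m],
  ["/(?<=fo).*/m", /(?<=fo).*/m],
  ["/(?<=f).*/",   /(?<=f).*/],
  ["/(?<=fo).*/",  /(?<=fo).*/],
]

puts "Ruby #{RUBY_VERSION}"
patterns.each do |label, pattern|
  puts "#{label.ljust(14)} => #{match_offset(pattern).inspect}"
end
```

Printing RUBY_VERSION alongside the results makes it easier to compare output from different interpreters.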

See it on Rubular

EDIT: Some more observations:

Adding an end-of-line anchor doesn't change anything:

"foo" =~ /(?<=fo).*$/m        => nil 

But together with a lazy quantifier, it "works":

"foo" =~ /(?<=fo).*?$/m       => 2 

EDIT: And some more observations:

.+ works, as does its equivalent .{1,}, but only in Ruby 1.9 (that seems to be the only behavioral difference between the two versions in this scenario):

"foo" =~ /(?<=fo).+/m         => 2
"foo" =~ /(?<=fo).{1,}/       => 2

In Ruby 2.0:

"foo" =~ /(?<=fo).+/m         => nil
"foo" =~ /(?<=fo).{1,}/m      => nil

.{0,} is busted (in both 1.9 and 2.0):

"foo" =~ /(?<=fo).{0,}/m      => nil 

But {n,m} works in both:

"foo" =~ /(?<=fo).{0,1}/m     => 2
"foo" =~ /(?<=fo).{0,2}/m     => 2
"foo" =~ /(?<=fo).{0,999}/m   => 2
"foo" =~ /(?<=fo).{1,999}/m   => 2
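
To compare the quantifiers above across interpreter versions in one go, something like the following sketch could be used. quantifier_offsets is a hypothetical helper name; it builds each pattern with Regexp.new and the Regexp::MULTILINE option, which corresponds to the /m modifier.

```ruby
# Hypothetical helper: for each quantifier string, build (?<=fo).<quantifier>
# with the /m option and record where it matches in +subject+ (nil = no match).
def quantifier_offsets(quantifiers, subject = "foo")
  quantifiers.each_with_object({}) do |quantifier, results|
    pattern = Regexp.new("(?<=fo).#{quantifier}", Regexp::MULTILINE)
    results[quantifier] = (subject =~ pattern)
  end
end

# Quantifiers from the observations above (%w does not interpolate).
quantifiers = %w[* *$ *?$ + {1,} {0,} {0,1} {0,2} {0,999} {1,999}]

puts "Ruby #{RUBY_VERSION}"
quantifier_offsets(quantifiers).each do |quantifier, offset|
  puts "(?<=fo).#{quantifier}".ljust(18) + " => #{offset.inspect}"
end
```

Running the same file under each installed interpreter should show which quantifiers are affected on which version.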
Tim Pietzcker asked Mar 05 '13 21:03


1 Answer

This has been officially classified as a bug and subsequently fixed, together with another problem concerning \Z anchors in multiline strings.

Tim Pietzcker answered Oct 01 '22 04:10