
Weirdness with gsub

I was trying to use gsub to remove non-word characters from a string in a Rails app. I used the following code:

somestring.gsub(/[\W]/i, '')  #=> ""

but this is actually incorrect: it removes the letter k as well. The correct version is:

somestring.gsub(/\W/i, '')  #=> "kkk"

My problem is that the RSpec unit test for the Rails controller containing the code above does not catch this: the test actually passes. So I wrote a pretty extreme test case in RSpec:

it "test this gsub" do
  'kkk'.gsub(/[\W]/i, '').should == 'kkk'
end

The test case above should fail, but it actually passes. What is the problem here? Why would the test pass?

Ben asked Apr 27 '12 15:04



1 Answer

Ruby 1.9 switched to a different regular expression engine (Oniguruma), which accounts for the behavior change. This looks like a bug in the new engine.
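Since the behavior depends on the regex engine, it may help to confirm which Ruby is actually running the spec and which is running irb. This is only a minimal sketch, assuming the two are launched by different interpreters; the helper name `ruby_info` is hypothetical and not part of RSpec or Rails:

```ruby
# Hypothetical helper: report the interpreter executing this code.
# RUBY_VERSION is a built-in constant; RUBY_ENGINE may be missing on
# very old interpreters, so it is checked with defined? first.
def ruby_info
  engine = defined?(RUBY_ENGINE) ? RUBY_ENGINE : 'ruby'
  "#{engine} #{RUBY_VERSION}"
end

puts ruby_info
```

Printing this once from inside the spec and once from irb shows whether both go through the same Ruby.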

For your example, you can get around the issue by not specifying a case-insensitive match:

irb(main):001:0> 'kkk'.gsub(/[\W]/i, '')
=> ""
irb(main):002:0> 'kkk'.gsub(/[\W]/, '')
=> "kkk"
irb(main):004:0> 'kkk'.gsub(/\W/i, '')
=> "kkk"
irb(main):003:0> 'kkk'.gsub(/\W/, '')
=> "kkk"

Update: It looks like dropping the character group (using \W on its own) is another workaround. It might be that negated shorthand classes like \W aren't handled correctly inside a character group?
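Because \W is the complement of \w, which already covers both upper- and lower-case letters, the /i flag adds nothing to this pattern. Here is a sketch of a helper that avoids both the character group and the flag (the name `strip_non_word` is hypothetical, chosen only for illustration):

```ruby
# Hypothetical helper: remove every character that is not a word
# character (letters, digits, underscore). No /i flag and no [...]
# wrapper, so the case-folding problem described above is sidestepped.
def strip_non_word(str)
  str.gsub(/\W/, '')
end

strip_non_word('kkk')          # => "kkk"
strip_non_word('foo-bar 42!')  # => "foobar42"
strip_non_word('snake_case')   # => "snake_case"
```

Note that the underscore counts as a word character, so it is kept.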

Nevir answered Sep 18 '22 15:09
