
Ruby Regexp: difference between new and union with a single regexp

Tags: regex, ruby

I have simplified the examples. Say I have a string containing the code for a regex. I would like the regex to match a literal dot and thus I want it to be:

\.

So I create the following Ruby string:

"\\."

However when I use it with Regexp.union to create my regex, I get this:

irb(main):017:0> Regexp.union("\\.")
=> /\\\./

That will match a backslash followed by a dot, not just a single dot. Compare the previous result to this:

irb(main):018:0> Regexp.new("\\.")
=> /\./

which gives the Regexp I want but without the needed union.

Could you explain why Ruby behaves this way, and how to build the correct union of regexes? The use case is importing JSON strings that describe regexes and union-ing them in Ruby.

Ludovic Kuty asked Oct 16 '11 13:10


1 Answer

A string passed to Regexp.union is matched literally. There is no need to escape it yourself, because Regexp.escape is already applied internally. That is why "\\." (a backslash followed by a dot) comes out as /\\\./: both characters get escaped.

Regexp.union(".")
#=> /\./
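
To make the difference concrete, here is a small sketch comparing the two constructors on the string from the question. The variable names are only illustrative, and `match?` assumes Ruby 2.4 or later.

```ruby
# The two-character string from the question: a backslash followed by a dot.
source = "\\."

literal = Regexp.union(source)  # string is escaped first, then compiled
pattern = Regexp.new(source)    # string is compiled as regex source as-is

# A single string passed to union behaves like passing it through Regexp.escape.
same_as_escape = (literal.source == Regexp.escape(source))

literal_hits_backslash_dot = literal.match?("a\\.b")  # text contains \.
literal_hits_plain_dot     = literal.match?("a.b")
pattern_hits_plain_dot     = pattern.match?("a.b")
pattern_hits_no_dot        = pattern.match?("ab")
```

Here `literal` only matches text that really contains a backslash followed by a dot, while `pattern` matches any literal dot.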

If you want to pass regular expressions to Regexp.union, don't use strings:

Regexp.union(Regexp.new("\\."))
#=> /\./
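
For the JSON use case from the question, one possible approach is to compile each imported string with Regexp.new first and only then pass the resulting Regexp objects to Regexp.union. This is a minimal sketch: the JSON payload and the names `json`, `sources`, and `combined` are made up for illustration, and it assumes the JSON is a flat array of regex source strings.

```ruby
require "json"

# Hypothetical payload: a JSON array of regex sources.
# In JSON, "\\." decodes to the two characters \. (backslash, dot).
json = <<~'JSON'
  ["\\.", "\\d+"]
JSON

sources = JSON.parse(json)  # Ruby strings holding \. and \d+

# Compile each source into a Regexp so union does not escape it again.
combined = Regexp.union(sources.map { |src| Regexp.new(src) })

matches_dot    = combined.match?("a.b")
matches_digits = combined.match?("42")
matches_plain  = combined.match?("abc")
```

Regexp.new raises RegexpError on an invalid source, so untrusted input may warrant a rescue around the compile step.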
molf answered Sep 24 '22 16:09