Inconsistent behavior in Ruby gsub replacement?

Tags: regex, ruby, gsub

The two gsub calls below are identical, yet they produce different output. Can anybody explain why?

Code is also available at https://gist.github.com/franklsf95/6c0f8938f28706b5644d.

    ver = 9999
    str = "\t<key>CFBundleDevelopmentRegion</key>\n\t<string>en</string>\n\t<key>CFBundleVersion</key>\n\t<string>0.1.190</string>\n\t<key>AppID</key>\n\t<string>000000000000000</string>"
    puts str.gsub /(CFBundleVersion<\/key>\n\t.*\.).*(<\/string>)/, "#{$1}#{ver}#{$2}"
    puts '--------'
    puts str.gsub /(CFBundleVersion<\/key>\n\t.*\.).*(<\/string>)/, "#{$1}#{ver}#{$2}"

My Ruby version is ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0] (MRI). On my machine, the output is:

    <key>CFBundleDevelopmentRegion</key>
    <string>en</string>
    <key>9999
    <key>AppID</key>
    <string>000000000000000</string>
    --------
    <key>CFBundleDevelopmentRegion</key>
    <string>en</string>
    <key>CFBundleVersion</key>
    <string>0.1.9999</string>
    <key>AppID</key>
    <string>000000000000000</string>

The second one is the desired effect, but the first one is wrong.

asked Jul 10 '14 00:07 by franklsf95


2 Answers

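It comes down to when Ruby evaluates the replacement string.

"#{$1}#{ver}#{$2}" is a double-quoted string, so Ruby interpolates $1 and $2 before gsub is even called. At that point they still hold whatever the previous match left behind: nil (interpolated as an empty string) on the first call, and the captures set by the first gsub on the second call. To refer to the captures of the current match inside a replacement string, use the backreferences \1 and \2 in single quotes, like this: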

    puts str.gsub /(CFBundleVersion<\/key>\n\t.*\.).*(<\/string>)/, '\1' + ver.to_s + '\2'
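To make the timing visible, here is a minimal sketch that first runs an unrelated match, so that $1 holds a value that clearly does not belong to the gsub call. The shortened input string, the pattern, and the variable names (pattern, stale, stale_result, fresh_result) are illustrative assumptions, not part of the original question.

```ruby
ver = 9999
# Shortened, hypothetical input (the question's string is longer).
str = "<key>CFBundleVersion</key>\n<string>0.1.190</string>"
pattern = /(<string>.*\.).*(<\/string>)/

# An unrelated match: afterwards $1 == "un" and $2 == nil.
"unrelated" =~ /(un)/

# Double quotes: $1 and $2 are interpolated right here, before gsub runs,
# so the replacement is built from the unrelated match above.
stale = "#{$1}#{ver}#{$2}"
stale_result = str.gsub(pattern, stale)

# Single quotes: \1 and \2 are passed through literally and resolved by
# gsub against the current match.
fresh_result = str.gsub(pattern, '\1' + ver.to_s + '\2')

puts stale        # un9999
puts stale_result
puts '--------'
puts fresh_result
```

The stale variant replaces the whole matched line with "un9999", while the backreference variant keeps the surrounding tags and only swaps the last version component.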
answered Sep 22 '22 15:09 by Some Guy


If you use the block form of gsub(), your code will work correctly:

    ver = 9999

    str = "\t<key>CFBundleDevelopmentRegion</key>\n\t<string>en</string>\n\t<key>CFBundleVersion</key>\n\t<string>0.1.190</string>\n\t<key>AppID</key>\n\t<string>000000000000000</string>"

    puts str.gsub(/(CFBundleVersion<\/key>\n\t.*\.).*(<\/string>)/) {|match|
      "#{$1}#{ver}#{$2}"
    }

    puts '-' * 20

    puts str.gsub(/(CFBundleVersion<\/key>\n\t.*\.).*(<\/string>)/) {|match|
      "#{$1}#{ver}#{$2}"
    }

--output:--
    <key>CFBundleDevelopmentRegion</key>
    <string>en</string>
    <key>CFBundleVersion</key>
    <string>0.1.9999</string>
    <key>AppID</key>
    <string>000000000000000</string>
--------------------
    <key>CFBundleDevelopmentRegion</key>
    <string>en</string>
    <key>CFBundleVersion</key>
    <string>0.1.9999</string>
    <key>AppID</key>
    <string>000000000000000</string>

The docs for String#gsub describe this behavior:

If replacement is a String, ... However, within replacement the special match variables, such as $&, will not refer to the current match.

...

In the block form, the current match string is passed in as a parameter, and variables such as $1, $2, $`, $&, and $' will be set appropriately. The value returned by the block will be substituted for the match on each call.
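As a small illustration of the quoted passage, the sketch below uses a string with several matches so that it is clear the block runs once per match, with $1 and $2 refreshed each time. The input string and the variable names (pairs, swapped, same) are made up for this example; Regexp.last_match is shown only as an equivalent way to reach the same captures.

```ruby
# Hypothetical input with three separate matches.
pairs = "a1 b2 c3"

# The block is evaluated for every match, so $1 and $2 always belong
# to the match currently being replaced.
swapped = pairs.gsub(/([a-z])(\d)/) { "#{$2}#{$1}" }

# The same captures are available through the MatchData object.
same = pairs.gsub(/([a-z])(\d)/) do
  m = Regexp.last_match
  "#{m[2]}#{m[1]}"
end

puts swapped   # 1a 2b 3c
puts same      # 1a 2b 3c
```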

answered Sep 18 '22 15:09 by 7stud