
Why does String#gsub double content?

s = "#main= 'quotes'"
s.gsub "'", "\\'" # => "#main= quotes'quotes"

This seems wrong; I expected to get "#main= \\'quotes\\'".

When I don't use an escape character, it works as expected:

s.gsub "'", "*" # => "#main= *quotes*"

So it must have something to do with escaping.

I'm using Ruby 1.9.2p290.

I need to replace each single quote with a backslash followed by a quote.

Even more inconsistencies:

"\\'".length # => 2
"\\*".length # => 2

# As expected
"'".gsub("'", "\\*").length # => 2
"'a'".gsub("'", "\\*") # => "\\*a\\*" (length==5)

# WTF next:
"'".gsub("'", "\\'").length # => 0

# Doubling the content?
"'a'".gsub("'", "\\'") # => "a'a" (length==3)

What is going on here?

Dmytrii Nagirniak asked Aug 16 '11 06:08



2 Answers

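s = "#main = 'quotes'"

s.gsub "'", "\\\\'"

In a double-quoted string literal, \\ stands for a single backslash, and gsub then treats a backslash in the replacement as an escape character of its own. So to get one literal backslash into the result, you have to write four of them.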

Kleber S. answered Sep 30 '22 14:09


You're getting tripped up by the specialness of \' inside a regular expression replacement string:

\0, \1, \2, ... \9, \&, \`, \', \+
Substitutes the value matched by the nth grouped subexpression, or by the entire match, pre- or postmatch, or the highest group.
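To make that list concrete, here is a small sketch of what a few of those sequences expand to. The sample strings and variable names are made up for illustration:

```ruby
s = "a-b-c"

# \& (same as \0) is the entire match.
wrapped = s.gsub("b", "[\\&]")   # => "a-[b]-c"

# \` is the prematch: everything before the match ("a-").
pre = s.gsub("b", "\\`")         # => "a-a--c"

# \' is the postmatch: everything after the match ("-c").
post = s.gsub("b", "\\'")        # => "a--c-c"

# \1 and \2 are the first and second capture groups.
swapped = "john smith".gsub(/(\w+) (\w+)/, "\\2 \\1")  # => "smith john"
```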

So when you say "\\'", the double \\ in the string literal becomes a single backslash, and the replacement that gsub actually sees is \'. That sequence means "the string to the right of the last successful match", which is why the text after each quote gets copied into the result. If you want to replace single quotes with escaped single quotes, you need to escape more to get past the specialness of \':

s.gsub("'", "\\\\'")

Or avoid the toothpicks and use the block form:

s.gsub("'") { |m| '\\' + m }
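As a quick check, this sketch runs both forms on the string from the question and compares them. The variable names are only for illustration:

```ruby
s = "#main= 'quotes'"

# String form: the literal "\\\\'" holds three characters (\ \ '),
# and gsub collapses the two backslashes into one.
replacement = "\\\\'"
escaped_str = s.gsub("'", replacement)

# Block form: the block's return value is inserted as-is,
# with no special handling of backslash sequences.
escaped_block = s.gsub("'") { |m| '\\' + m }

puts replacement.length            # prints 3
puts escaped_str                   # prints #main= \'quotes\'
puts escaped_str == escaped_block  # prints true
```

Both forms give the same result; the block form just skips the second round of escaping.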

You would run into similar issues if you were trying to escape backticks, a plus sign, or even a single digit.
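For instance, here is a rough sketch with a hypothetical helper, backslash_escape (not part of Ruby, just a name picked for this example), that uses the block form so the same code works for any of those characters:

```ruby
# Hypothetical helper: prefix every occurrence of `char` with a backslash.
# A String pattern is matched literally, so "+" needs no regex escaping.
def backslash_escape(text, char)
  text.gsub(char) { |m| "\\" + m }
end

backtick = backslash_escape("a`b", "`")  # a\`b
plus     = backslash_escape("1+1", "+")  # 1\+1
digit    = backslash_escape("x1y", "1")  # x\1y

# The naive string form goes wrong for the backtick:
# \` expands to the prematch "a" instead of a literal backslash + backtick.
naive = "a`b".gsub("`", "\\`")           # => "aab"
```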

The overall lesson here is to prefer the block form of gsub for anything but the most trivial of substitutions.

mu is too short answered Sep 30 '22 14:09