There is something mysterious to me about the escape status of a backslash within a single-quoted string literal passed as an argument to String#tr. Can you explain the contrast between the three examples below? I particularly do not understand the second one. To avoid complication, I am using 'd' here, which does not change meaning when escaped in a double-quoted string ("\d" = "d").
'\\'.tr('\\', 'x') #=> "x"
'\\'.tr('\\d', 'x') #=> "\\"
'\\'.tr('\\\d', 'x') #=> "x"
tr
The first argument of tr works much like a bracket character class in regular expressions. You can use ^ at the start of the expression to negate the matching (replace anything that doesn't match) and use e.g. a-f to match a range of characters. Since these characters have special meaning, tr also handles escaping internally, so you can put a backslash in front of - and ^ to use them as literal characters.
print 'abcdef'.tr('b-e', 'x') # axxxxf
print 'abcdef'.tr('b\-e', 'x') # axcdxf
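As a small sketch of the rules above (ranges, negation with a leading ^, and backslash-escaped - and ^), the following can be run as-is. The variable names are only illustrative.

```ruby
# Illustrative variable names; the tr calls themselves are plain String#tr.
s = 'abcdef'

range_result   = s.tr('b-e', 'x')      # b..e is a range
negated_result = s.tr('^b-e', 'x')     # leading ^ negates the set
literal_dash   = 'a-b'.tr('\-', '_')   # escaped dash is a literal dash
literal_caret  = 'a^b'.tr('\^', '_')   # escaped caret is a literal caret

puts range_result    # axxxxf
puts negated_result  # xbcdex
puts literal_dash    # a_b
puts literal_caret   # a_b
```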
Furthermore, inside a single-quoted literal Ruby keeps a backslash as a literal character unless it is used to escape another backslash or a single quote.
# Single quotes
print '\\' # \
print '\d' # \d
print '\\d' # \d
print '\\\d' # \\d
# Double quotes
print "\\" # \
print "\d" # d
print "\\d" # \d
print "\\\d" # \d
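Because printed backslashes are easy to miscount, it can help to look at the individual characters instead. This is only a quick check using String#chars and String#length; the variable names are illustrative.

```ruby
# Same source text, different quoting.
single = '\\\d'   # single quotes: \\ -> \, and \d stays \d
double = "\\\d"   # double quotes: \\ -> \, and \d -> d

p single.chars    # ["\\", "\\", "d"]
p double.chars    # ["\\", "d"]

p single.length   # 3
p double.length   # 2
```

Note that p shows each backslash doubled because it prints the inspected (escaped) form of the string.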
With all that in mind, let's look at the examples again.
'\\'.tr('\\', 'x') #=> "x"
The string defined as '\\' becomes the literal string \ because the first backslash escapes the second. No surprises there.
'\\'.tr('\\d', 'x') #=> "\\"
The string defined as '\\d' becomes the literal string \d. The tr engine, in turn, uses the backslash in that literal string to escape the d. Result: tr replaces instances of d with x. Since the input string contains only a backslash and no d, it comes back unchanged.
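One way to see this is to run the same set against an input that does contain a d. This is a minimal sketch; set and the sample inputs are made up for illustration.

```ruby
set = '\\d'                # two characters: a backslash and a d
p set.length               # 2

# tr reads the backslash as an escape for d, so the set is just {d}.
unchanged = '\\'.tr(set, 'x')
replaced  = 'd\\d'.tr(set, 'x')

p unchanged                # "\\"   (the lone backslash is untouched)
p replaced                 # "x\\x" (only the d characters were replaced)
```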
'\\'.tr('\\\d', 'x') #=> "x"
The string defined as '\\\d' becomes the literal \\d. First \\ becomes \. Then \d stays \d, i.e. the backslash is preserved. (This particular behavior is different from double-quoted strings, where the backslash would be eaten alive, leaving only a lonesome d.)
The literal string \\d then makes tr replace all characters that are either a backslash or a d with the replacement string.
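The same kind of check works for this case. Again, set and the sample input are illustrative names, not part of the original question.

```ruby
set = '\\\d'               # three characters: backslash, backslash, d
p set.length               # 3

# tr reads \\ as an escaped (literal) backslash, then d, so the set is {\, d}.
only_backslash = '\\'.tr(set, 'x')
mixed          = 'a\\d'.tr(set, 'x')

p only_backslash           # "x"
p mixed                    # "axx"
```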