I have a file with phrases such as "Canyon St / 27th Way" that I am trying to turn into "Canyon St and 27th Way" with Ruby regular expressions.
I used file = file.gsub(/(\b) \/ (\b)/, "#{$1} and #{$2}")
to make the match, but I am a little stumped about what \b really means and why $1 contains all of the characters before the word boundary that precedes the slash and why $2 contains all of the characters after the word boundary starting the next word.
Usually, I expect that whatever is in parentheses in a regular expression would be in $1 and $2, but I am not sure what parentheses around a word boundary would really mean because there really is nothing between the transition from a word character to a white space character.
The \b metacharacter matches at the beginning or end of a word.
Using regex \B-\B matches - between the word color - coded . Using \b-\b on the other hand matches the - in nine-digit and pass-key .
Word Boundary: \b The word boundary \b matches positions where one side is a word character (usually a letter, digit or underscore—but see below for variations across engines) and the other side is not a word character (for instance, it may be the beginning of the string or a space character).
Matches only at the start of the string. \b. Matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of word characters. Note that formally, \b is defined as the boundary between a \w and a \W character (or vice versa), or between \w and the beginning/end of the string.
The parentheses aren't doing anything in this context. You could get the same result using /\b \/ \b/
.
I think you are getting a little confused by $1
and $2
. Those aren't actually doing anything either. They are nil because they are matching nothing (just a word boundry). What you have written is the logical equivalent of .gsub(/\b \/ \b/, " and ")
The $1 and $2 are not actually related to your regex match: a method's arguments are evaluated before the method is called, so
"#{$1} and #{$2}"
Is evaluated before the regex is matched against your string. If you haven't done earlier regex matches then these variables will be nil, so you're actually doing
file = file.gsub(/(\b) \/ (\b)/, " and ")
that is you are replacing a slash surrounded by spaces by "and", also surrounded by spaces. $1 and $2 will be updated to be empty strings, and so you'll see the same behaviour when you process the next string.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With