How to change case of letters in string using RegEx in Ruby

Tags: regex, ruby

Say I have a string: "hEY "

I want to convert it to "Hey "

string.gsub!(/([a-z])([A-Z]+ )/, '\1'.upcase)

That is the idea I have, but it seems like the upcase method does nothing when I use it within the gsub method. Why is that?

EDIT: I came up with this method:

string.gsub!(/([a-z])([A-Z]+ )/) { |str| str.downcase!.capitalize! }

Is there a way to do this within the regex though? I don't really understand the '\1' '\2' thing. Is that backreferencing? How does that work?

ordinary asked Mar 26 '13 00:03


3 Answers

@sawa has the simple answer, and you've edited your question with another mechanism. However, to answer two of your questions:

Is there a way to do this within the regex though?

No, Ruby's regex does not support a case-changing feature as some other regex flavors do. You can "prove" this to yourself by reviewing the official Ruby regex docs for 1.9 and 2.0 and searching for the word "case":

  • https://github.com/ruby/ruby/blob/ruby_1_9_3/doc/re.rdoc
  • https://github.com/ruby/ruby/blob/ruby_2_0_0/doc/re.rdoc

I don't really understand the '\1' '\2' thing. Is that backreferencing? How does that work?

Your use of \1 is a kind of backreference. Strictly speaking, a backreference is \1 (and \2, and so on) used inside the search pattern itself, where it matches the same text that the corresponding group already captured. For example, the regular expression /f(.)\1/ will find the letter f, followed by any character, followed by that same character (e.g. "foo" or "f!!").
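As a quick check of the pattern-side behaviour just described, here is a minimal sketch. The word list and variable names are made up for illustration, and String#match? assumes Ruby 2.4 or later.

```ruby
# \1 inside the pattern must repeat whatever group 1 captured.
pattern = /f(.)\1/

# "foo" and "f!!" repeat the captured character; "fob" does not.
words   = %w[foo f!! fob]
matches = words.map { |word| word.match?(pattern) }

p matches
```

Only the first two words should match, because "fob" has a different character after the captured "o".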

In your case, \1 appears in a replacement string passed to a method like String#gsub, where it likewise refers to the text captured by the first group. From the docs:

"If replacement is a String it will be substituted for the matched text. It may contain back-references to the pattern’s capture groups of the form \d, where d is a group number, or \k<n>, where n is a group name. If it is a double-quoted string, both back-references must be preceded by an additional backslash."

In practice, this means:

"hello world".gsub( /([aeiou])/, '_\1_' )  #=> "h_e_ll_o_ w_o_rld"
"hello world".gsub( /([aeiou])/, "_\1_" )  #=> "h_\u0001_ll_\u0001_ w_\u0001_rld"
"hello world".gsub( /([aeiou])/, "_\\1_" ) #=> "h_e_ll_o_ w_o_rld"

Now, you have to understand when code runs. In your original code…

string.gsub!(/([a-z])([A-Z]+ )/, '\1'.upcase)

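…what you are doing is first calling upcase on the string '\1' (which has no effect, because that two-character string contains no letters) and then calling the gsub! method, passing in a regex and the unchanged string as parameters. The upcase call runs before gsub! does, so it never sees the captured text.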
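To make the order of evaluation concrete, here is a small sketch that uses the question's regex and sample string. The variable names are only for illustration, and it uses the non-destructive gsub so the original string is left alone.

```ruby
# The argument is evaluated before gsub is ever called.
replacement = '\1'.upcase

# upcase found no letters to change, so gsub receives plain '\1'.
unchanged = (replacement == '\1')
p unchanged

string = "hEY "
# The whole match "hEY " is replaced by the first capture, "h".
result = string.gsub(/([a-z])([A-Z]+ )/, replacement)
p result
```

So the one-argument-string form does run a substitution here, but it only ever inserts the capture as it was matched.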

Finally, another way to achieve this same goal is with the block form like so:

# Take your pick of which you prefer:
string.gsub!(/([a-z])([A-Z]+ )/){ $1.upcase << $2.downcase }
string.gsub!(/([a-z])([A-Z]+ )/){ [$1.upcase,$2.downcase].join }
string.gsub!(/([a-z])([A-Z]+ )/){ "#{$1.upcase}#{$2.downcase}" }

In the block form of gsub, the captures are available through the special variables $1, $2, etc., and the block's return value is used as the replacement string.
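As a rough illustration of the block form running once per match, the sketch below applies the same regex to a longer, made-up input string. It also shows Regexp.last_match as an equivalent way to read the captures if you prefer to avoid the $-variables.

```ruby
# Hypothetical input with two matches for the question's regex.
string = "hEY there, wORLD "

# The block runs once per match; $1 and $2 hold that match's captures.
with_dollar_vars = string.gsub(/([a-z])([A-Z]+ )/) { "#{$1.upcase}#{$2.downcase}" }

# Same substitution, reading the captures from the MatchData instead.
with_last_match = string.gsub(/([a-z])([A-Z]+ )/) do
  m = Regexp.last_match
  m[1].upcase + m[2].downcase
end

p with_dollar_vars
p with_last_match
```

Because the block is evaluated after each match is found, upcase and downcase operate on the captured text itself.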

Phrogz answered Oct 19 '22 11:10


I don't know why you are trying to do it in a complicated way, but the usual way is:

"hEY".capitalize # => "Hey"

If you insist on using a regex and upcase, then you would also need downcase:

"hEY".downcase.sub(/\w/){$&.upcase} # => "Hey"

sawa answered Oct 19 '22 12:10


If you really want to just swap the case of every letter in the string, you can avoid the complexity of regex entirely because There's A Method For That™.

"hEY".swapcase # => "Hey"
"HellO thERe".swapcase # => "hELLo THerE"

There's also swapcase! to do it destructively.
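One detail worth knowing about the destructive variant: like most bang methods on String, swapcase! returns nil when it makes no changes. A minimal sketch, with variable names chosen for illustration:

```ruby
# dup gives a mutable copy even if string literals are frozen.
greeting = "hEY".dup
greeting.swapcase!          # modifies greeting in place
p greeting

digits = "123".dup
no_change = digits.swapcase! # nothing to swap, so the return value is nil
p no_change
```

For that reason it is safer not to chain further calls onto the return value of swapcase!.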

Scott Olson answered Oct 19 '22 13:10