I want 'This Is A 101 Test' to be 'This Is A Test', but I can't get the syntax right.
src = 'This Is A 101 Test'
puts "A) " + src # base => "This Is A 101 Test"
puts "B) " + src[/([a-z]+)/] # only does first word => "his"
puts "C) " + src.gsub!(/\D/, "") # Does digits, I want alphabetic => "101"
puts "D) " + src.gsub!(/\W///g) # Nothing. => ""
puts "E) " + src.gsub(/(\W|\d)/, "") # Nothing. => ""
To remove all non-alphanumeric characters from a string, call the replace() method, passing it a regular expression that matches all non-alphanumeric characters as the first parameter and an empty string as the second. The replace method returns a new string with all matches replaced.
Using the replaceAll() method: a common solution to remove all non-alphanumeric characters from a String is with regular expressions. The idea is to match everything that is not alphanumeric with the regular expression [^A-Za-z0-9] and replace it with an empty string, so that only alphanumeric characters remain. You can also use the regular expression [^\w], which is equivalent to [^a-zA-Z_0-9] and therefore keeps underscores as well.
A simple solution is to use regular expressions for removing non-alphanumeric characters from a string. The idea is to use the special character \W , which matches any character which is not a word character.
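In Ruby, the two patterns described above could be sketched with gsub as follows. The sample string and variable names are only illustrative and are not from the question.

```ruby
sample = 'foo_bar 42!'

# \W matches anything that is not a letter, digit, or underscore,
# so the underscore survives while the space and "!" are removed.
no_non_word = sample.gsub(/\W/, '')

# An explicit negated character class removes the underscore as well.
alnum_only = sample.gsub(/[^A-Za-z0-9]/, '')

puts no_non_word # foo_bar42
puts alnum_only  # foobar42
```

Which pattern fits depends on whether underscores should count as "alphanumeric" for your purposes.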
First off, you need to be careful with gsub and gsub!. The latter is "dangerous" (hence the !) and will modify the value of src in place. If you're executing these statements in order, be aware that a.gsub!(/a/, "b") and a = a.gsub(/a/, "b") will both do the same thing to a. Part of the issue with your code is that src is being modified.
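A minimal sketch of the difference, using an illustrative string rather than the one from the question (the .dup is there only so the example also works when string literals are frozen):

```ruby
original = 'banana'.dup

# gsub returns a new string and leaves the receiver alone.
copy = original.gsub(/a/, 'o')
p original # "banana"
p copy     # "bonono"

# gsub! changes the receiver in place.
original.gsub!(/a/, 'o')
p original # "bonono"

# When nothing matches, gsub! returns nil instead of the string.
result = original.gsub!(/a/, 'o')
p result   # nil
```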
The B method returns "his" but makes no changes to src:
src[/([a-z]+)/] # => "his"
src # => "This Is A 101 Test"
The C method removes all characters that aren't digits, and because it uses gsub! it changes src itself:
src.gsub!(/\D/, "") # => "101"
src # => "101"
The D method doesn't work because the syntax is wrong. The gsub method accepts a regular expression or string to search for and then a string to use as the replacement. If you try it in IRB, it will act as though you need another / somewhere.
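For comparison, here is a sketch of what a syntactically valid version of D might look like, assuming the intent was "remove every non-word character". No trailing g flag is needed: gsub already replaces every match, while sub replaces only the first one.

```ruby
src = 'This Is A 101 Test'

# sub stops after the first match (the first space).
first_only = src.sub(/\W/, '')

# gsub replaces every match, so all spaces are removed.
every_match = src.gsub(/\W/, '')

puts first_only  # ThisIs A 101 Test
puts every_match # ThisIsA101Test
```

Note that this still does not produce the string the question asks for, since \W matches the spaces and leaves the digits.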
The E method removes all non-word characters and all digits. Note that \W also matches spaces, so the spaces are removed too:
src.gsub(/(\W|\d)/, "") # => "ThisIsATest"
src # => "This Is A 101 Test"
You point out that it's returning "". Well, what's actually happening is that C and D as listed (with syntax issues fixed) are destructive changes. (Also, if run on "101", D will actually return nil as no substitutions were performed.) So E is just being run on "101", and since you're replacing all non-word characters and all digits with "", it becomes "".
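The chain of events can be reproduced with a short sketch. It assumes D was meant to be src.gsub!(/\W/, ""), which is a guess at the intended syntax, and uses .dup so the string can be modified even when literals are frozen.

```ruby
src = 'This Is A 101 Test'.dup

c = src.gsub!(/\D/, '')     # C: strips every non-digit, src is now "101"
d = src.gsub!(/\W/, '')     # D (fixed): nothing matches, so it returns nil
e = src.gsub(/(\W|\d)/, '') # E: runs on "101" and removes every digit

p c   # "101"
p d   # nil
p e   # ""
p src # "101"
```

Because D returns nil here, an expression like "D) " + src.gsub!(/\W/, "") would raise a TypeError rather than print an empty string.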
The answer you're looking for would be something like:
src.gsub!(/\d\s?/, "") # => "This Is A Test"
src # => "This Is A Test"
And my favorite for dealing with all scenarios of double spaces (because squeeze is quite efficient at combining repeated characters, strip is quite efficient at removing leading and trailing whitespace, and the ! versions of those methods return nil if they make no changes):
src = src.gsub(/\d+/, "").squeeze(" ").strip
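To see why this version copes with more inputs, here is a small sketch that wraps it in a helper. The method name strip_digits and the extra sample strings are hypothetical, added only for illustration.

```ruby
# Hypothetical helper wrapping the one-liner above.
def strip_digits(text)
  text.gsub(/\d+/, '').squeeze(' ').strip
end

middle   = strip_digits('This Is A 101 Test')
trailing = strip_digits('Test 101')
leading  = strip_digits('42 Is The Answer')

# The /\d\s?/ pattern leaves a trailing space when the digits come last.
leftover = 'Test 101'.gsub(/\d\s?/, '')

p middle   # "This Is A Test"
p trailing # "Test"
p leading  # "Is The Answer"
p leftover # "Test "
```

One trade-off to keep in mind: squeeze(" ") collapses every run of spaces, including any that were in the original text on purpose.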
Instead of listing the characters to remove, you can keep only the ones you want (here, letters and spaces) and then collapse repeated spaces:
src = 'This Is A 101 Test'
src.gsub(/[^a-zA-Z ]/,'').gsub(/ +/,' ')
=> "This Is A Test"
I recommend Rubular for trying out Ruby regular expressions.