Ruby RegEx problem: text.gsub([^\W-], '') fails

Tags: regex, ruby

I'm trying to learn RegEx in Ruby, based on what I'm reading in "The Rails Way". But, even this simple example has me stumped. I can't tell if it is a typo or not:

text.gsub(/\s/, "-").gsub([^\W-], '').downcase

It seems to me that this would replace all spaces with -, then anywhere a string starts with a non letter or number followed by a dash, replace that with ''. But, using irb, it fails first on ^:

syntax error, unexpected '^', expecting ']'

If I take out the ^, it fails again on the W.

scubabbl asked Sep 26 '08 11:09


2 Answers

>> text = "I love spaces"
=> "I love spaces"
>> text.gsub(/\s/, "-").gsub(/[^\W-]/, '').downcase
=> "--"

You're missing the // delimiters around the pattern.

Although this makes a little more sense :-)

>> text.gsub(/\s/, "-").gsub(/([^\W-])/, '\1').downcase
=> "i-love-spaces"

And this is probably what is meant

>> text.gsub(/\s/, "-").gsub(/[^\w-]/, '').downcase
=> "i-love-spaces"

\W matches a non-word character; \w matches a word character (a letter, digit, or underscore).
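To make the \w versus \W distinction concrete, here is a small sketch. The sample string is made up for illustration; scan simply collects every match of the pattern.

```ruby
# \w matches word characters (ASCII letters, digits, underscore);
# \W matches every other character.
sample = "a_1 -!"   # hypothetical sample string

word_chars     = sample.scan(/\w/)
non_word_chars = sample.scan(/\W/)

puts word_chars.inspect      # ["a", "_", "1"]
puts non_word_chars.inspect  # [" ", "-", "!"]
```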

The // delimiters create a Regexp object:

/[^\W-]/.class => Regexp

Vinko Vrsalovic answered Sep 19 '22 19:09


Step 1: Add this to your bookmarks. Whenever I need to look up regexes, it's my first stop

Step 2: Let's walk through your code

text.gsub(/\s/, "-")

You're calling the gsub method and giving it 2 arguments.
The first argument is /\s/, which is Ruby for "create a new regexp containing \s" (the // are like special "" for regexes).
The second argument is the string "-".

This will therefore replace all whitespace characters with hyphens. So far, so good.
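A quick sketch of that first step on its own, using an invented input string that contains a tab and a double space. Note that \s matches any whitespace character, not just the space.

```ruby
# Each whitespace character (space, tab, ...) becomes one hyphen.
text = "I love\tspaces  a lot"   # hypothetical input

hyphenated = text.gsub(/\s/, "-")

puts hyphenated  # I-love-spaces--a-lot
```

Because every whitespace character is replaced individually, a run of two spaces yields two hyphens. If one hyphen per run is preferred, /\s+/ is one possible alternative.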

.gsub([^\W-], '').downcase

Next you call gsub again, passing it 2 arguments. The first argument is [^\W-]. Because it isn't wrapped in forward slashes, Ruby tries to parse it as ordinary code. [] starts an array literal, and ^\W- is not a valid array element, so the parser stops with a syntax error.
Changing it to /[^\W-]/ gives us a valid regex.
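The difference can be checked from a script as well as from irb. This sketch uses eval only so that the parse failure can be rescued and inspected; the variable names are made up for illustration.

```ruby
# Without slashes, [ ... ] is parsed as an Array literal and the code is rejected.
bare_is_syntax_error =
  begin
    eval('"a b".gsub([^\W-], "")')
    false
  rescue SyntaxError
    true
  end

# With slashes, the same characters form a Regexp literal.
pattern = /[^\W-]/

puts bare_is_syntax_error  # true
puts pattern.class         # Regexp
```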

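Looking at the regex, the [] says "match any single character in this set". The leading ^ negates the set, so [^\W-] matches any character that is neither \W (a non-word character) nor a hyphen. In other words, it matches every word character: letters, digits, and underscore.

As the second thing you pass to gsub is an empty string, it ends up replacing all the word characters with the empty string, leaving only the hyphens and other non-word characters. That is almost certainly not what the book intended; with a lowercase w, /[^\w-]/ strips everything except word characters and hyphens.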
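Since the ^ negates the character class, the case of the w decides what gets removed. A small sketch comparing the two classes on an invented, already-hyphenated string:

```ruby
# Hypothetical input, after whitespace has been turned into hyphens.
hyphenated = "Hello,-World!-It's-fine"

# [^\W-]: "not (non-word or hyphen)" = every word character is removed.
as_printed = hyphenated.gsub(/[^\W-]/, "")

# [^\w-]: "not (word or hyphen)" = punctuation is removed instead.
lowercase_w = hyphenated.gsub(/[^\w-]/, "")

puts as_printed   # ,-!-'-
puts lowercase_w  # Hello-World-Its-fine
```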

.downcase

Which just converts the string to lower case.
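Putting the three steps together, a minimal sketch might look like the following. The method name slugify is an assumption for illustration, and the pattern uses the lowercase \w on the guess that this is what the book meant.

```ruby
# Hypothetical helper; the name `slugify` is not from the book.
def slugify(text)
  text.gsub(/\s/, "-")      # whitespace -> hyphens
      .gsub(/[^\w-]/, "")   # drop anything that is not a word character or hyphen
      .downcase             # lower-case the result
end

puts slugify("I love spaces")      # i-love-spaces
puts slugify("Ruby & RegEx 101!")  # ruby--regex-101
```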

Hope this helps :-)

Orion Edwards answered Sep 18 '22 19:09