
Simple way for removing all non word characters

Tags: regex, ruby

I'd like to remove all non-word characters and digits from a string in the simplest way possible. For example, "a,sd3 31ds" should become "asdds". I can do it like this:

"a,sd3 31ds".gsub(/\W/, "").gsub(/\d/,"")
# => "asdds"

but it looks a little awkward. Is it possible to merge these regexes into one?

evfwcqcg asked Sep 22 '11 08:09


2 Answers

"a,sd3 31ds".gsub(/(\W|\d)/, "")
Tudor Constantin answered Oct 20 '22 00:10
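
As a quick sanity check, the merged pattern from the answer above can be compared with the two-step version from the question. This is only a sketch: the helper names strip_two_step and strip_merged are made up for illustration. It also shows one thing worth knowing: \W is the complement of \w, which includes the underscore, so neither version removes "_".

```ruby
# Hypothetical helper names, used only for this illustration.

# The approach from the question: two chained substitutions.
def strip_two_step(str)
  str.gsub(/\W/, "").gsub(/\d/, "")
end

# The merged approach: one alternation matching either class.
def strip_merged(str)
  str.gsub(/(\W|\d)/, "")
end

sample = "a,sd3 31ds"
puts strip_two_step(sample)  # => asdds
puts strip_merged(sample)    # => asdds

# \w covers [a-zA-Z0-9_], so an underscore survives both versions.
puts strip_merged("a_b 1")   # => a_b
```

If underscores should go too, the pattern would need to name them explicitly, for example /[\W\d_]/.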


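I would go for the regexp /[\W\d]+/. A character class avoids both the alternation and the capture group, and the + lets one substitution remove a whole run of unwanted characters, so it is potentially faster than e.g. /(\W|\d)/.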

require 'benchmark'

N = 500_000  # iterations per pattern
# Patterns are kept as strings so they can double as report labels
Regexps = [ "(\\W|\\d)", "(\\W|\\d)+", "(?:\\W|\\d)", "(?:\\W|\\d)+",
            "\\W|\\d", "[\\W\\d]", "[\\W\\d]+" ]

Benchmark.bm(15) do |x|
  Regexps.each do |re_str|
    re = Regexp.new(re_str)  # compile once, outside the timed loop
    x.report("/#{re_str}/:") { N.times { "a,sd3 31ds".gsub(re, "") } }
  end
end

gives (with ruby 2.0.0p195 [x64-mingw32])

                      user     system      total        real
/(\W|\d)/:        1.950000   0.000000   1.950000 (  1.951437)
/(\W|\d)+/:       1.794000   0.000000   1.794000 (  1.787569)
/(?:\W|\d)/:      1.857000   0.000000   1.857000 (  1.855515)
/(?:\W|\d)+/:     1.638000   0.000000   1.638000 (  1.626698)
/\W|\d/:          1.856000   0.000000   1.856000 (  1.865506)
/[\W\d]/:         1.732000   0.000000   1.732000 (  1.754596)
/[\W\d]+/:        1.622000   0.000000   1.622000 (  1.617705)
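
All seven patterns remove the same characters from the sample string, so the choice among them is only about speed. The sketch below checks that, and also counts how many separate matches the single-character and the + variants need, which is one plausible reason the + forms come out ahead. The constant names PATTERNS and SAMPLE are assumptions for this example, not part of the benchmark.

```ruby
# Same pattern strings as in the benchmark; constant names are illustrative.
PATTERNS = ["(\\W|\\d)", "(\\W|\\d)+", "(?:\\W|\\d)", "(?:\\W|\\d)+",
            "\\W|\\d", "[\\W\\d]", "[\\W\\d]+"]
SAMPLE = "a,sd3 31ds"

# Every pattern should yield the same cleaned string.
results = PATTERNS.map { |src| SAMPLE.gsub(Regexp.new(src), "") }
puts results.uniq.inspect            # => ["asdds"]

# Without +, each unwanted character is its own match;
# with +, the run "3 31" is consumed as a single match.
puts SAMPLE.scan(/[\W\d]/).size      # => 5
puts SAMPLE.scan(/[\W\d]+/).size     # => 2
```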
undur_gongor answered Oct 20 '22 00:10