I have a string input field in a form, and I get its value in the params hash. How should I remove all characters except letters and numbers from that string?
In Ruby, we can delete characters from a string by using the String#delete method. It returns a new string with the specified characters removed; the original string is left unchanged (use delete! to modify the string in place).
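As a minimal sketch of how delete could be applied to the question above: a leading ^ in the argument negates the character set, so everything outside the listed ranges is removed. The helper name strip_non_alphanumeric and the sample string are made up for illustration.

```ruby
# Hypothetical helper (not from the original answers): keep only ASCII
# letters and digits by deleting the complement of those ranges.
def strip_non_alphanumeric(input)
  # The leading ^ negates the set, so anything outside A-Z, a-z, 0-9 is deleted.
  input.delete('^A-Za-z0-9')
end

original = 'Hello, World! 123_abc'
cleaned  = strip_non_alphanumeric(original)

puts cleaned   # HelloWorld123abc
puts original  # Hello, World! 123_abc (delete returned a new string)
```

Note that this only covers ASCII letters; accented or non-Latin letters would be removed as well.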
The chop method removes the last character of a string in Ruby. If the string ends with \r\n, it removes both characters. If it is called on an empty string, an empty string is returned. We can call chop on a string twice to remove the last two characters.
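The cases described above can be sketched as follows. The method name chop_examples is only a wrapper for illustration. Keep in mind that chop trims from the end of the string and does not filter characters, so on its own it does not answer the original question.

```ruby
# Illustrative wrapper (hypothetical name) collecting the chop cases
# mentioned above.
def chop_examples
  {
    plain: 'hello'.chop,        # last character removed
    crlf:  "hello\r\n".chop,    # \r\n is removed as a pair
    empty: ''.chop,             # empty string stays empty
    twice: 'hello'.chop.chop    # two calls remove two characters
  }
end

chop_examples.each { |name, result| puts "#{name}: #{result.inspect}" }
# plain: "hell"
# crlf: "hello"
# empty: ""
# twice: "hel"
```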
Use lookAhead =~ /[[:alnum:]]/ if you just want to check whether the char is alphanumeric without needing to know which. You can read more in Ruby's docs for regular expressions.
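A short sketch of that check, plus one way the same character class could be used to strip a string. The method names alphanumeric? and keep_alnum are hypothetical. As far as I know, [[:alnum:]] in Ruby is Unicode-aware for UTF-8 strings, so it accepts more than A-Z, a-z and 0-9; use explicit ranges if only ASCII should pass.

```ruby
# Hypothetical helpers built on the POSIX bracket expression [[:alnum:]].

# True when the string contains at least one alphanumeric character;
# intended here for single-character input.
def alphanumeric?(char)
  char.match?(/[[:alnum:]]/)
end

# Remove every character that is not alphanumeric.
def keep_alnum(input)
  input.gsub(/[^[:alnum:]]/, '')
end

puts alphanumeric?('a')       # true
puts alphanumeric?('_')       # false
puts keep_alnum('a_b-c 1!')   # abc1
```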
Just to remind people of good ol' tr:
asdf.tr('^A-Za-z0-9', '')
which takes the complement of the character ranges and translates those characters to '', that is, deletes them.
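Applied to the form value from the question, this might look like the sketch below. The plain Hash stands in for the framework's params object, and the :name key and the sanitize_alnum helper are assumptions made for illustration.

```ruby
# Hypothetical helper: to_s guards against a missing (nil) value before
# tr removes everything outside A-Z, a-z and 0-9.
def sanitize_alnum(value)
  value.to_s.tr('^A-Za-z0-9', '')
end

# Stand-in for the params hash; a real controller would supply its own.
params = { name: 'J@ne_Doe #42' }

puts sanitize_alnum(params[:name])            # JneDoe42
puts sanitize_alnum(params[:missing]).empty?  # true
```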
I was curious whether using a \W character class was faster than ranges, and gsub vs. tr:
require 'benchmark'

asdf = [('A'..'z').to_a, ('0'..'9').to_a].join

puts asdf
puts asdf.tr( '^A-Za-z0-9', '' )
puts asdf.gsub( /[\W_]+/, '' )
puts asdf.gsub( /\W+/, '' )
puts asdf.gsub( /\W/, '' )
puts asdf.gsub( /[^A-Za-z0-9]+/, '' )
puts asdf.scan(/[a-z\d]/i).join

n = 100_000
Benchmark.bm(7) do |x|
  x.report("tr:")    { n.times do; asdf.tr('^A-Za-z0-9', ''); end }
  x.report("gsub1:") { n.times do; asdf.gsub(/[\W_]+/, ''); end }
  x.report("gsub2:") { n.times do; asdf.gsub(/\W+/, ''); end }
  x.report("gsub3:") { n.times do; asdf.gsub(/\W/, ''); end }
  x.report("gsub4:") { n.times do; asdf.gsub(/[^A-Za-z0-9]+/, ''); end }
  x.report("scan:")  { n.times do; asdf.scan(/[a-z\d]/i).join; end }
end

>> ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
>> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
>>               user     system      total        real
>> tr:       0.560000   0.000000   0.560000 (  0.557883)
>> gsub1:    0.510000   0.000000   0.510000 (  0.513244)
>> gsub2:    0.820000   0.000000   0.820000 (  0.823816)
>> gsub3:    0.960000   0.000000   0.960000 (  0.955848)
>> gsub4:    0.900000   0.000000   0.900000 (  0.902166)
>> scan:     5.630000   0.010000   5.640000 (  5.630990)
You can see that a couple of the patterns (the ones using a bare \W) aren't catching the '_', which is part of \w, and as a result aren't meeting the OP's request.
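The underscore difference is easy to see in isolation. This is only a small sketch; the method names and the sample string are made up for illustration.

```ruby
# \W is the complement of \w, and \w includes '_', so a bare \W pattern
# leaves underscores in place.
def strip_with_w(input)
  input.gsub(/\W+/, '')
end

# Adding '_' to the character class removes underscores too.
def strip_with_w_and_underscore(input)
  input.gsub(/[\W_]+/, '')
end

sample = 'foo_bar-baz 42'
puts strip_with_w(sample)                 # foo_barbaz42
puts strip_with_w_and_underscore(sample)  # foobarbaz42
```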