
How can I find non ascii strings in an array of strings, in Rails 2.0/ruby 1.8.6?

I have an array full of user logins that was loaded from the database. What's the simplest and most efficient way to keep only the logins that contain non-ASCII characters?

logins = Users.find(:all).map{|user|user.login}
logins_with_non_ascii_characters = logins.select{ |login| ...??? }

Thanks

Edit: if you have a SQL solution (I use MySQL, but a generic solution would be better) that filters the logins directly in the first line, using a :conditions clause, I'm OK with that too. In fact, it would be far more efficient:

logins = Users.find(:all, :conditions => "...???").map{|user|user.login}
MiniQuark asked Dec 17 '22 07:12


2 Answers

You can abuse Ruby's built-in regular expression character classes for this.

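[:print:] matches all printable ASCII characters. It doesn't match ASCII control characters such as BEL (the beep) or, importantly, the bytes of multibyte characters. (This holds for Ruby 1.8, which treats strings as bytes; from Ruby 1.9 on, [[:print:]] is encoding-aware and also matches printable non-ASCII characters.)

Working on the assumption that your users are unlikely to have an ASCII BEL character in their login,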

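# reject users whose login contains a non-printable or non-ASCII character
# (reject rather than reject!, because reject! returns nil when nothing is removed)
valid_users = users.reject {|user| user.login =~ /[^[:print:]]/}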

should do it for you.
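Since the question asks to keep the logins that do contain non-ASCII characters, a select-based variant may be closer to what is needed. The following is only a minimal sketch under assumptions: it works on plain login strings, and the names NON_ASCII and logins_with_non_ascii are made up for illustration. It tests for anything outside the 7-bit range instead of relying on [:print:], so control characters such as BEL are not flagged.

```ruby
# Matches any character (or byte, on Ruby 1.8) outside the 7-bit ASCII range.
NON_ASCII = /[^\x00-\x7F]/

# Hypothetical helper: returns a new array and leaves the input untouched.
def logins_with_non_ascii(logins)
  logins.select { |login| login =~ NON_ASCII }
end

logins_with_non_ascii(["alice", "zoé", "bob42"])  # => ["zoé"]
```

On Ruby 1.9 or later, String#ascii_only? offers the same test without a regex, but it is not available on the 1.8.6 mentioned in the question.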

Patrick McKenzie answered Dec 19 '22 19:12


All I have found so far is this:

def is_ascii(str)
    str.each_byte {|c| return false if c>=128}
    true
end

logins = Users.find(:all).map{|user|user.login}
logins_with_non_ascii_characters = logins.select{ |login| not is_ascii(login) }

It's a bit disappointing, and certainly not efficient. Anyone got a better idea?
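For the :conditions variant raised in the question, one option on MySQL might be to compare the byte length of the column with its character length. This is a sketch under a loud assumption: the login column is stored in a multibyte character set such as utf8, so every non-ASCII character takes more than one byte. With a single-byte character set such as latin1 the condition would never match. The constant name NON_ASCII_LOGIN_CONDITION is hypothetical.

```ruby
# Assumption: `login` is stored in a multibyte character set (e.g. utf8).
# In MySQL, LENGTH() counts bytes and CHAR_LENGTH() counts characters,
# so the two differ only when a login contains a multibyte character.
NON_ASCII_LOGIN_CONDITION = "LENGTH(login) <> CHAR_LENGTH(login)"

# Hypothetical usage with the Rails 2.0 finder from the question
# (left as a comment because it needs a database connection):
#   logins = Users.find(:all, :conditions => NON_ASCII_LOGIN_CONDITION).map { |user| user.login }
```

This keeps the filtering in the database, so only the matching rows are loaded into Ruby.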

MiniQuark answered Dec 19 '22 21:12