 

What options exist now for using UTF-8 in Ruby and RoR?

Having followed Ruby's development closely, I learned that full character encoding support is implemented in Ruby 1.9. My question for now is: how can Ruby be used at the moment to talk to a database that stores all data in UTF-8?

Background: I am involved in a new project where Ruby/RoR is at least an option. But the project needs to rely on an internationalized character set (it is spread over many countries), preferably UTF-8.

So how do you deal with that? Thanks in advance.

Georgi asked Nov 06 '22 23:11



1 Answer

Ruby 1.8 works fine with UTF-8 strings for basic string operations. Depending on your application's needs, some operations will either not work or not work as expected.

For example:

1) The size of a string is reported in bytes, not characters, since multi-byte support is not there yet. But do you need to know the size of your strings in characters?

2) Strings cannot be reliably split at a character boundary. But do you need this?

3) Sorting in Ruby compares raw bytes, so the order will look wrong for accented and non-Latin characters. The suggestion of using the db to sort is a good idea.

etc.
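
If you do need character counts or character-safe splitting, one possible workaround is sketched below. It relies only on `unpack`, `pack` and the `/u` regexp flag, which as far as I know behave the same way in 1.8 and later versions. The helper names (`utf8_byte_count`, `utf8_chars`, `utf8_truncate`) are made up for this example and are not part of Ruby or Rails.

```ruby
# Sketch: character-aware helpers for UTF-8 strings that avoid String#size.
# The helper names are hypothetical, not part of Ruby or Rails.

# Number of bytes in the string.
def utf8_byte_count(str)
  str.unpack("C*").length
end

# Array of characters. The /u flag makes "." match one whole UTF-8
# character, and /m lets it match newlines too.
def utf8_chars(str)
  str.scan(/./mu)
end

# First max_chars characters, never cutting a multi-byte character in half.
def utf8_truncate(str, max_chars)
  utf8_chars(str).first(max_chars).join
end

# "naive" spelled with U+00EF (i with diaeresis), built from code points.
word = [0x6E, 0x61, 0xEF, 0x76, 0x65].pack("U*")

puts utf8_byte_count(word)    # 6 bytes
puts utf8_chars(word).length  # 5 characters
puts utf8_truncate(word, 3)
```

In a Rails app of that era, the bundled multibyte helpers may already cover this, so treat the above as an illustration of the byte/character difference rather than something you must write yourself.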

Re poster's comment about sorting data after reading from the db: as noted, results will probably not match users' expectations. So the solution is to sort in the db. It will usually be faster anyhow, since databases are designed to sort data.
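
To see the kind of surprise meant here, a small sketch with made-up sample words: Ruby's plain `sort` compares the underlying bytes, so an accented word lands after "zebra" instead of next to its unaccented neighbour.

```ruby
# Sketch: plain Array#sort orders UTF-8 strings by byte value.
# The sample words are illustrative only.

# "ecole" with an acute accent on the first letter (U+00E9).
ecole = [0xE9].pack("U") + "cole"

words  = ["zebra", ecole, "eclair"]
sorted = words.sort

# The accented word ends up last, because its first byte (0xC3)
# is larger than the byte for "z" (0x7A).
puts sorted.inspect
```

A database configured with a suitable collation would normally place the accented word among the other "e" words, which is what users tend to expect.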

Summary: My Ruby 1.8.6 RoR app works fine with international Unicode characters processed and stored as UTF-8 on modern browsers. Right-to-left languages work fine too. Main issues: be sure that your db and all web pages are set to use UTF-8. If you already have some data in your db that is stored in another encoding, then you'll need to go through a conversion process to change it to UTF-8.
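
Before converting existing data, it can help to find out which values are not valid UTF-8 already. A rough sketch, assuming you can get at the raw bytes of each value: `unpack("U*")` raises `ArgumentError` on malformed UTF-8, so it can serve as a cheap check. The method name `valid_utf8?` is hypothetical, and this is not a complete validator. Ruby 1.9 and later offer `String#valid_encoding?` for the same purpose.

```ruby
# Sketch: rough check whether a byte string decodes as UTF-8.
# valid_utf8? is a hypothetical helper name, not a Ruby built-in.
def valid_utf8?(str)
  str.unpack("U*")   # raises ArgumentError on malformed UTF-8
  true
rescue ArgumentError
  false
end

# "cafe" with an accented e, once as UTF-8 bytes and once as Latin-1 bytes.
utf8_bytes   = [0x63, 0x61, 0x66, 0xC3, 0xA9].pack("C*")
latin1_bytes = [0x63, 0x61, 0x66, 0xE9].pack("C*")

puts valid_utf8?(utf8_bytes)
puts valid_utf8?(latin1_bytes)
```

Values that fail the check are candidates for conversion. Which source encoding they are actually in is something only you can know from how the data was entered.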

Regards,

Larry

Larry K answered Nov 11 '22 06:11
