
UTF8 MySQL problems on Rails - encoding issues with utf8_general_ci

I have a staging Rails site up that's running on MySQL 5.0.32-Debian.

On this particular site, all of my tables are using utf8 / utf8_general_ci encoding.

Inside that database, I have some data that looks like so:

mysql> select * from currency_types limit 1,10;
+------+-----------------+---------+
| code | name            | symbol  |
+------+-----------------+---------+
| CAD  | Canadian Dollar | $       |
| CNY  | Chinese Yuan    | å…ƒ     |
| EUR  | Euro            | €     |
| GBP  | Pound           | £      |
| INR  | Indian Rupees   | ₨     |
| JPY  | Yen             | ¥      |
| MXN  | Mexican Peso    | $       |
| USD  | US Dollar       | $       |
| PHP  | Philippine Peso | ₱     |
| DKK  | Denmark Kroner  | kr      |
+------+-----------------+---------+

Here's the issue I'm having

On staging (with the db and Rails site both running on the Debian box), the symbol characters appear correctly when displayed from Rails. For instance, the Chinese Yuan appears as 元 in my browser, not å…ƒ as it shows inside the database.

When I download that data to my local OS X development machine and run the db and Rails locally, I see the representation from inside the DB (å…ƒ) on my browser, not the character 元 as I see in staging.

Debugging I've done

I've ensured all headers for Content-Type are coming back as utf8 from each webserver (local, staging).

My local mysql server and the staging server are both set up to use utf8 as the default charset. I'm using "set names 'utf8'" before I make any calls.

I can even connect to my staging db from my OS X Rails host, and I still see the characters å…ƒ representing the yuan. So I'm guessing there's an issue with my local mysql client, but I can't figure out what it is.

Perhaps this might lend a clue

To make it even more confusing, if I paste the character 元 into the db on my local machine, it shows up fine in the web browser. Yet if I paste that same character into my staging db, I get a ? mark in its place on the page from my staging Rails site.

Also, locally on my OS X Rails machine, if I use "set names 'latin1'" before my queries, the characters all come back properly. I did have these tables set as latin1 before. Could this be the issue?

Someone please help me out here, I'm going crazy trying to figure out what's wrong!

Subimage asked Dec 06 '08 08:12



2 Answers

Do you have these two lines in your database.yml under the proper section?

encoding: utf8
collation: utf8_general_ci

Can Berk Güder answered Oct 13 '22 11:10


AHA! It seems I had some table data encoded in latin1 before, and stupidly changed the databases to utf8 without converting the data.
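
For anyone wondering where å…ƒ comes from: as far as I can tell, it is what the three UTF-8 bytes of 元 look like when each byte is read as a latin1 character (MySQL's latin1 is effectively Windows-1252). Here is a rough Ruby sketch of that misreading. It does not touch MySQL at all, and the method name misread_as_latin1 is just something I made up for illustration:

```ruby
# Hypothetical helper (not part of Rails or MySQL): reinterpret the bytes of a
# UTF-8 string as Windows-1252 (what MySQL calls "latin1"), then transcode the
# resulting characters back to UTF-8. This mimics double encoding.
def misread_as_latin1(text)
  text.dup.force_encoding("Windows-1252").encode("UTF-8")
end

yuan = "\u5143" # the yuan sign from the question
puts yuan.bytes.map { |b| format("%02X", b) }.join(" ") # E5 85 83
puts misread_as_latin1(yuan)                             # å…ƒ
```

So the rows that look broken in the mysql client are probably UTF-8 text that was encoded a second time when the column charset was switched.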

Running the following fixed that currency_types table:

mysqldump -u root -p --opt --default-character-set=latin1 --skip-set-charset  DBNAME > DBNAME.sql

mysql -u root -p --default-character-set=utf8  DBNAME < DBNAME.sql

Now I just have to make sure that the other content generated after the latin1-to-utf8 switch isn't messed up by that :(
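
To spot-check rows like that, something along these lines might help. It is only a heuristic sketch, assuming the strings arrive in Ruby as UTF-8: it tries to undo one round of double encoding and leaves the value alone if the result is not valid UTF-8. The method name repair_double_encoding is hypothetical, and a legitimate value that happens to look like mojibake would be changed too, so review the results by hand.

```ruby
# Hypothetical helper: try to undo one round of latin1-as-utf8 double encoding.
# Returns the repaired string, or the input unchanged when it does not look
# double-encoded (conversion fails, or the result is not valid UTF-8).
def repair_double_encoding(text)
  # Map each character back to its single Windows-1252 byte, then read those
  # bytes as UTF-8 again.
  candidate = text.encode("Windows-1252").force_encoding("UTF-8")
  candidate.valid_encoding? ? candidate : text
rescue Encoding::UndefinedConversionError
  # The text contains characters with no Windows-1252 byte (e.g. a real 元),
  # so it cannot be the product of this kind of double encoding.
  text
end

puts repair_double_encoding("\u00E5\u2026\u0192") # the broken yuan value
puts repair_double_encoding("\u5143")             # already correct, unchanged
```

Rows written after the switch should come back unchanged, while rows from before it should come back repaired.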

Subimage answered Oct 13 '22 11:10