In my database, I have the following entry:
id | name | info
1 | John Smith | Çö ¿¬¼
As you can tell, the info column displays incorrectly; it's actually Korean. In Chrome, when I switch the browser encoding from UTF-8 to Korean ('euc-kr', I think), the text displays like this:
id | name | info
1 | John Smith | 횉철 쩔짭쩌
I then manually copy that text into the info column in the database and save, and now I can view it in UTF-8 without switching my browser's encoding.
Awesome. Now I'd like to get that same thing done in Rails, not manually. So starting with the original entry again, I go to the console and type:
require 'iconv'
u = User.find(1)
info = u.info
new_info = Iconv.iconv('euc-kr','UTF-8', info)
u.update_attribute('info', new_info)
However, what I end up with is something resembling \x{A2AF}\x{A8FA}\x{A1C6} \x{A2A5}\x{A8A2} in the database, not 횉철 쩔짭쩌.
I have a very basic understanding of unicode and encoding.
Can someone please explain what's going on here and how to get around that? The desired result is what I achieved manually.
Thanks!
Wow. I'm beating myself over the head now. After hours of trying to resolve this, I finally figured it out myself a few minutes after I posted a question here.
The solution consists of three simple steps:
STEP 1:
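I almost had it right. Iconv.iconv takes the target encoding first and the source encoding second, so my original call was converting from UTF-8 to euc-kr. It should be the other way around, from euc-kr to UTF-8, as such: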
Iconv.iconv('UTF-8', 'euc-kr', info)
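For anyone on Ruby 2.0 or later, where Iconv is no longer part of the standard library (it lives on as the iconv gem), the same direction can be sketched with String#encode. This is only an illustration under assumptions: the sample word and variable names are made up, and it assumes the stored bytes really are EUC-KR.

```ruby
# Sample Korean word, written with escapes so the file's own encoding doesn't matter.
korean = "\uD55C\uAD6D\uC5B4"

# Simulate raw EUC-KR bytes that were stored without an encoding label.
euc_kr_bytes = korean.encode('EUC-KR').b

# Right direction: label the bytes as EUC-KR (source), then transcode to UTF-8 (target).
decoded = euc_kr_bytes.dup.force_encoding('EUC-KR').encode('UTF-8')

puts decoded == korean
```

Note that force_encoding only relabels the bytes, while encode actually rewrites them, which is why the source label has to be correct before transcoding.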
STEP 2:
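The text might still contain byte sequences that can't be converted, so to be safe I append //IGNORE to the target encoding, which tells Iconv to skip those characters instead of raising an error: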
Iconv.iconv('UTF-8//IGNORE', 'euc-kr', info)
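A rough String#encode counterpart to //IGNORE, as a sketch rather than a drop-in replacement: the invalid:, undef: and replace: options belong to String#encode, but the helper name and the sample bytes are hypothetical. Like //IGNORE, this silently drops data, so it is worth checking the result afterwards.

```ruby
# Hypothetical helper: treat the string's bytes as EUC-KR and transcode to UTF-8,
# dropping anything that can't be converted instead of raising.
def euc_kr_to_utf8_lossy(raw)
  raw.dup.force_encoding('EUC-KR')
     .encode('UTF-8', invalid: :replace, undef: :replace, replace: '')
end

# "\xC7\xD1" and "\xB1\xB9" are valid EUC-KR pairs; "\xFF" is not a valid EUC-KR byte.
damaged = "\xC7\xD1\xFF\xB1\xB9".b

# Without the options, the strict conversion raises on the bad byte.
begin
  damaged.dup.force_encoding('EUC-KR').encode('UTF-8')
  strict_failed = false
rescue Encoding::InvalidByteSequenceError
  strict_failed = true
end

cleaned = euc_kr_to_utf8_lossy(damaged)
puts strict_failed
puts cleaned.valid_encoding?
```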
Finally, I actually get REAL KOREAN TEXT, yay! The problem is, when I try to insert it into the database, it's still inserting something along the lines of:
UPDATE `users` SET `info` = '--- \n- \"\\xEC\\xB2\\xA0\\xEC\\xB1\\x8C...' etc...
Even though it turns out I have the right text. So why is that? Onto the last step.
STEP 3:
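Turns out Iconv.iconv returns an array, not a string, and Rails was serializing that array as YAML (hence the '--- \n- ' prefix in the query above). So we merge it into a single string with join: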
Iconv.iconv('UTF-8//IGNORE', 'euc-kr', info).join
And this actually works!
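To see why the array caused trouble, here is a small sketch that doesn't need Iconv at all. The array below is a made-up stand-in for what Iconv.iconv returns (one converted string per input string); dumping it as YAML produces the same '---' and '- ' markers that showed up in the UPDATE statement above, while join gives back a plain string.

```ruby
require 'yaml'

# Stand-in for the return value of Iconv.iconv: an array holding one converted string.
converted = ["\uCCA0\uC218"]

as_yaml = converted.to_yaml   # YAML dump of the Array, starting with "---" and a "- " item
as_text = converted.join      # a plain String, which is what the column should hold

puts as_yaml.start_with?('---')
puts as_text.class
```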
The final code:
require 'iconv'
u = User.find(1)
info = u.info
new_info = Iconv.iconv('UTF-8//IGNORE','euc-kr', info).join
u.update_attribute('info', new_info)
Hope this helps whoever sees this (and knowing myself, probably future me).