I am working on a Rails app.
I am using an API that returns some Chinese provinces. The API returns the province names as hex-escaped strings, for example:
{ "\xE5\x8C\x97\xE4\xBA\xAC" => "some data" }
My JavaScript calls a controller that returns this hash. I put all the province strings into a dropdown, but each string shows up as a black diamond with a question mark in the middle. How do I convert the Ruby hex string into the actual Chinese characters, 北京? Or, if possible, can I convert the hex string into Chinese characters in JavaScript?
The bytes \xE5\x8C\x97 are the UTF-8 representation of 北 and \xE4\xBA\xAC is the UTF-8 representation of 京. So this string:
"\xE5\x8C\x97\xE4\xBA\xAC"
is 北京 if the bytes are interpreted as UTF-8. That you're seeing hex codes instead of Chinese characters suggests that the string's encoding is binary:
> s = "\xE5\x8C\x97\xE4\xBA\xAC"
=> "北京"
> s.encoding
=> #<Encoding:UTF-8>
> s.force_encoding('binary')
=> "\xE5\x8C\x97\xE4\xBA\xAC"
So this API you're talking to is speaking UTF-8 but somewhere your application is losing track of what encoding that string is supposed to be. If you force the encoding to be UTF-8 then the problem goes away:
> s.force_encoding('utf-8')
=> "北京"
You should fix this encoding problem at the very edge of your application, where it reads data from this remote API. Once that's done, everything should be sensible UTF-8 everywhere that you care about. This should fix your JavaScript problem as well, since JavaScript is quite happy to work with UTF-8.
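As a rough sketch of what fixing it at the edge could look like, something along these lines re-tags the hash keys as UTF-8 right after the API response is read. The helper name utf8_keys and the simulated raw response are made up for illustration. The sketch also assumes the API really does send valid UTF-8, so it raises if that turns out not to be the case:

```ruby
# Hypothetical helper (not part of Rails or any library): returns a new hash
# whose string keys are tagged as UTF-8 instead of binary.
def utf8_keys(hash)
  hash.each_with_object({}) do |(key, value), result|
    # Hash keys are frozen, so work on a copy before changing the encoding tag.
    fixed = key.dup.force_encoding(Encoding::UTF_8)
    # force_encoding only relabels the bytes; check that they are valid UTF-8.
    raise ArgumentError, "invalid UTF-8 in key: #{key.inspect}" unless fixed.valid_encoding?
    result[fixed] = value
  end
end

# Simulated API response: String#b gives a binary-tagged copy of the bytes.
raw = { "\xE5\x8C\x97\xE4\xBA\xAC".b => "some data" }

provinces = utf8_keys(raw)
puts provinces.keys.first           # 北京
puts provinces.keys.first.encoding  # UTF-8
```

Calling this once where the response comes in means the controller and the JavaScript only ever see properly tagged UTF-8 strings.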