In my controller, the following works (prints "oké")
puts obj.inspect
But this doesn't (renders "ok\u00e9")
render :json => obj
Apparently the to_json
method escapes unicode characters. Is there an option to prevent this?
To set the \uXXXX codes back to utf-8:
json_string.gsub!(/\\u([0-9a-z]{4})/) {|s| [$1.to_i(16)].pack("U")}
You can prevent it by monkey patching the method mentioned by muu is too short. Put the following into config/initializers/patches.rb (or similar file used for patching stuff) and restart your rails process for the change to take affect.
module ActiveSupport::JSON::Encoding
class << self
def escape(string)
if string.respond_to?(:force_encoding)
string = string.encode(::Encoding::UTF_8, :undef => :replace).force_encoding(::Encoding::BINARY)
end
json = string.gsub(escape_regex) { |s| ESCAPED_CHARS[s] }
json = %("#{json}")
json.force_encoding(::Encoding::UTF_8) if json.respond_to?(:force_encoding)
json
end
end
end
Be adviced that there's no guarantee that the patch will work with future versions of ActiveSupport. The version used when writing this post is 3.1.3.
If you dig through the source you'll eventually come to ActiveSupport::JSON::Encoding
and the escape
method:
def escape(string)
if string.respond_to?(:force_encoding)
string = string.encode(::Encoding::UTF_8, :undef => :replace).force_encoding(::Encoding::BINARY)
end
json = string.
gsub(escape_regex) { |s| ESCAPED_CHARS[s] }.
gsub(/([\xC0-\xDF][\x80-\xBF]|
[\xE0-\xEF][\x80-\xBF]{2}|
[\xF0-\xF7][\x80-\xBF]{3})+/nx) { |s|
s.unpack("U*").pack("n*").unpack("H*")[0].gsub(/.{4}/n, '\\\\u\&')
}
json = %("#{json}")
json.force_encoding(::Encoding::UTF_8) if json.respond_to?(:force_encoding)
json
end
The various gsub
calls are forcing non-ASCII UTF-8 to the \uXXXX
notation that you're seeing. Hex encoded UTF-8 should be acceptable to anything that processes JSON but you could always post-process the JSON (or monkey patch in a modified JSON escaper) to convert the \uXXXX
notation to raw UTF-8 if necessary.
I'd agree that forcing JSON to be 7bit-clean is a bit bogus but there you go.
Short answer: no.
Characters were not escaped to unicode with the other methods in Rails2.3.11/Ruby1.8
so I used the following:
render :json => JSON::dump(obj)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With