
Prevent JSON pretty_generate from escaping Unicode

Is there any way to prevent Ruby's JSON.pretty_generate() method from escaping a Unicode character?

I have a JSON object as follows:

my_hash = {"my_str" : "\u0423"};

Running JSON.pretty_generate(my_hash) returns the value as being \\u0423.

Is there any way to prevent this behaviour?

Max Power asked Jun 27 '11 21:06


2 Answers

In your question you have a string of six characters, "\", "u", "0", "4", "2", "3" (my_hash = { "my_str" => '\u0423' }), not a string consisting of the single character "У" ("\u0423", note the double quotes).
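
The difference between the two literals can be checked directly. This is only a small sketch; the variable names single and double are illustrative and not part of the question.

```ruby
# Single quotes: \u is NOT interpreted, so the string holds six characters.
single = '\u0423'

# Double quotes: \u0423 is interpreted as one Cyrillic character (U+0423).
double = "\u0423"

puts single.length           # number of characters in the single-quoted string
puts double.length           # number of characters in the double-quoted string
puts single.chars.inspect    # the individual characters, starting with a backslash
puts double.codepoints.inspect
```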

According to RFC 4627, section 2.5, a backslash character in a JSON string must be escaped; this is why you get a double backslash from JSON.pretty_generate.

Alternatively, there are two-character sequence escape
representations of some popular characters. So, for example, a
string containing only a single reverse solidus character may be
represented more compactly as "\\".

char = unescaped /
       escape (...
           %x5C /          ; \    reverse solidus U+005C

escape = %x5C              ; \

Thus the Ruby JSON gem escapes this character internally, and there is no way to alter this behavior by passing parameters to JSON or JSON.pretty_generate.
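
To see the escaping in isolation, here is a minimal sketch with a hash whose only value is a single backslash (the key "k" is arbitrary). It also shows that parsing the generated text gives back the original string, so the doubled backslash exists only in the JSON text.

```ruby
require 'json'

# A hash whose value is a single backslash character.
data = { "k" => "\\" }

json = JSON.generate(data)
puts json                  # the backslash appears doubled in the JSON text

# Parsing the JSON text restores the original one-character string.
restored = JSON.parse(json)
puts restored == data
```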

If you are interested in the JSON gem's implementation details, it defines an internal mapping hash with an explicit mapping for the '\' character:

module JSON
    MAP = {
        ...
        '\\'  =>  '\\\\'

I took this code from the pure-Ruby variant of the JSON gem, installed with gem install json_pure (note that there is also a C extension variant, distributed by gem install json).
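
As a rough illustration of how a mapping hash like this can drive escaping, here is a reduced sketch. ESCAPE_MAP and escape_json_string are hypothetical names made up for this example; this is not the gem's actual code, and it covers only the backslash and the double quote.

```ruby
# Hypothetical, reduced mapping: only the backslash and the double quote.
ESCAPE_MAP = {
  '\\' => '\\\\',   # one backslash  -> two backslashes
  '"'  => '\\"'     # double quote   -> backslash + double quote
}.freeze

# Replace every character that has an entry in the map with its escaped form.
# The block form of gsub is used so the replacement is taken literally.
def escape_json_string(str)
  str.gsub(/["\\]/) { |ch| ESCAPE_MAP[ch] }
end

puts escape_json_string('\u0423')   # the six-character string from the question
```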

Conclusion: if you need to unescape the backslash after JSON generation, you need to implement it in your application logic, as in the code below:

require 'json'

my_hash = { "my_str" => '\u0423' }
# => {"my_str"=>"\\u0423"}

json = JSON.pretty_generate(my_hash)
# => "{\n  \"my_str\": \"\\\\u0423\"\n}"

# Collapse each doubled backslash back into a single one.
res = json.gsub("\\\\", "\\")
# => "{\n  \"my_str\": \"\\u0423\"\n}"
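
Note that a blanket replacement also collapses doubled backslashes that are legitimately part of other values. A narrower variant, sketched here under the assumption that only \uXXXX sequences should be restored, could look like this (the "path" entry is an invented example value):

```ruby
require 'json'

my_hash = { "my_str" => '\u0423', "path" => 'C:\temp' }
json = JSON.pretty_generate(my_hash)

# Collapse the doubled backslash only when it is followed by
# "u" and exactly four hex digits; other escaped backslashes stay as they are.
res = json.gsub(/\\\\(u\h{4})/) { "\\#{$1}" }
puts res

# The result is still valid JSON; "my_str" now parses to the single character.
parsed = JSON.parse(res)
puts parsed["my_str"].length
```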

Hope this helps!

Aliaksei Kliuchnikau answered Nov 06 '22 18:11


Usually, hashes are declared using the rocket => rather than the colon :. Also, since Ruby 1.9 there is an alternative syntax for symbol-keyed hashes: my_hash = {my_str: "\u0423"}. In this case, :my_str would be the key.
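
A short sketch of the two syntaxes side by side (the variable names rocket and symbol are only illustrative):

```ruby
require 'json'

rocket = { "my_str" => "\u0423" }  # string key, rocket syntax
symbol = { my_str: "\u0423" }      # symbol key, Ruby 1.9+ shorthand

puts rocket.keys.first.class
puts symbol.keys.first.class

# Symbol keys are written as strings, so both produce the same JSON text.
puts JSON.generate(rocket) == JSON.generate(symbol)
```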

Anyway, on my computer JSON.pretty_generate works as expected:

irb(main):002:0> my_hash = {"my_str" => "\u0423"}
=> {"my_str"=>"У"}
irb(main):003:0> puts JSON.pretty_generate(my_hash)
{
  "my_str": "У"
}
=> nil

Ruby 1.9.2p290, (built-in) json 1.4.2.
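
If the goal is the opposite, i.e. to get a \uXXXX escape in the output for a real non-ASCII character, the generator's ascii_only option may help. This is a sketch that assumes a json version supporting that option; it is not something the question or the output above relies on.

```ruby
require 'json'

my_hash = { "my_str" => "\u0423" }  # double quotes: one character

# Default: the character is written to the JSON text as-is.
literal = JSON.pretty_generate(my_hash)
puts literal

# With ascii_only, non-ASCII characters are written as \uXXXX escapes.
escaped = JSON.pretty_generate(my_hash, ascii_only: true)
puts escaped
```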

DNNX answered Nov 06 '22 20:11