Can someone clarify Gson's unicode encoding?

Question

In the following minimalistic example:

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

public class GsonStuff {

    public static void main(String[] args) {
        GsonBuilder builder = new GsonBuilder();
        Gson gson = builder.create();
        System.out.println(gson.toJson("Apostrophe: '"));
        // Outputs: "Apostrophe: \u0027"
    }
}

The apostrophe gets replaced by its Unicode representation in the printout. The String returned from the toJson method literally contains the characters '\', 'u', '0', '0', '2', '7'.

Decoding it with a JSON parser actually works and gives the string "Apostrophe: '" as opposed to "Apostrophe: \u0027". How should I decode it to get the same result?

An additional question: why doesn't an arbitrary Unicode character such as ش get encoded similarly?

Gustav Barkefors · Accepted Answer

By default, Gson Unicode-escapes certain characters, of which ' is one. (See HTML_SAFE_REPLACEMENT_CHARS in JsonWriter for the complete list.) A character such as ش is not on that list, so it is written out unchanged.

To disable this, call the following before builder.create():

builder.disableHtmlEscaping();
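
To make the behavior above concrete, here is a minimal, self-contained sketch that models it without depending on Gson. It is not Gson's actual code: the class HtmlSafeEscapeSketch and the helper quote are hypothetical names, the htmlSafe flag merely stands in for whether disableHtmlEscaping() was called, and the escaped set (<, >, &, = and ') is my reading of HTML_SAFE_REPLACEMENT_CHARS. Control characters and other cases a real JSON writer handles are left out for brevity.

```java
public class HtmlSafeEscapeSketch {

    // Hypothetical stand-in for the HTML-safe replacement table:
    // returns the escape for an HTML-sensitive character, or null if there is none.
    private static String htmlSafeReplacement(char c) {
        switch (c) {
            case '<':  return "\\u003c";
            case '>':  return "\\u003e";
            case '&':  return "\\u0026";
            case '=':  return "\\u003d";
            case '\'': return "\\u0027";
            default:   return null;
        }
    }

    // Hypothetical helper: writes a string as a JSON string literal.
    // htmlSafe = true models the default; false models disableHtmlEscaping().
    public static String quote(String value, boolean htmlSafe) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            String replacement = htmlSafe ? htmlSafeReplacement(c) : null;
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);   // escapes that JSON itself requires
            } else if (replacement != null) {
                out.append(replacement);      // optional HTML-safe escape
            } else {
                out.append(c);                // everything else is written as-is
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        // The apostrophe is escaped only in HTML-safe mode.
        System.out.println(quote("Apostrophe: '", true));
        System.out.println(quote("Apostrophe: '", false));
        // U+0634 (the Arabic letter from the question) has no table entry,
        // so it passes through unchanged in both modes.
        System.out.println(quote("Arabic: \u0634", true).equals(quote("Arabic: \u0634", false)));
    }
}
```

Both forms of the output are valid JSON for the same string, so any compliant JSON parser, for example gson.fromJson(json, String.class), should turn the escaped apostrophe back into a plain '.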

Tags:

java

unicode

gson

Miquel
