I noticed that Gson HTML-escapes the < and > characters, and that this can be disabled with the disableHtmlEscaping() builder configuration method. My question is: why does Gson HTML-escape by default? What are the risks of not HTML-escaping anything?
Thanks.
Actually, the disableHtmlEscaping() method tells Gson not to escape HTML characters such as <, >, &, =, and '.
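To make the difference concrete, here is a minimal sketch that serializes the same string with a default Gson instance and with one built using disableHtmlEscaping(). It assumes the Gson library is on the classpath; the class name GsonEscapingDemo, the two helper methods, and the sample value are made up for illustration.

```java
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

class GsonEscapingDemo {
    // Default configuration: HTML-sensitive characters are written as Unicode escapes.
    static String withDefaults(String value) {
        return new Gson().toJson(value);
    }

    // Same serialization, but with HTML escaping switched off on the builder.
    static String withoutHtmlEscaping(String value) {
        return new GsonBuilder().disableHtmlEscaping().create().toJson(value);
    }

    public static void main(String[] args) {
        // Sample value containing all five characters mentioned above.
        String value = "<a href='x?a=1&b=2'>";
        System.out.println(withDefaults(value));
        System.out.println(withoutHtmlEscaping(value));
    }
}
```

With the default instance the output should be "\u003ca href\u003d\u0027x?a\u003d1\u0026b\u003d2\u0027\u003e", while the second instance should print "<a href='x?a=1&b=2'>" unchanged. Both are valid JSON that parses back to the same string; only the textual form differs.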
An example in which a single quote can cause trouble: rendering unescaped JSON inside a <script/> tag in an HTML page without wrapping it in an additional <![CDATA[ ... ]]> section.
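The <script/> case can be sketched without Gson at all. The code below is not Gson's implementation; it is a hypothetical stand-in (escapeHtmlChars and embedInScript are invented names) that applies the same kind of Unicode escaping to already-serialized JSON. It relies on the assumption that the input is valid JSON, where these five characters can only occur inside string values.

```java
class ScriptEmbedSketch {
    // Hypothetical helper: rewrites HTML-sensitive characters as JSON Unicode escapes.
    // Assumes the input is valid JSON, so these characters only occur inside strings.
    static String escapeHtmlChars(String json) {
        StringBuilder out = new StringBuilder(json.length());
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            switch (c) {
                case '<':  out.append("\\u003c"); break;
                case '>':  out.append("\\u003e"); break;
                case '&':  out.append("\\u0026"); break;
                case '=':  out.append("\\u003d"); break;
                case '\'': out.append("\\u0027"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical page template that inlines JSON into a script element.
    static String embedInScript(String json) {
        return "<script>var data = " + json + ";</script>";
    }

    public static void main(String[] args) {
        // A user-supplied value that tries to close the script element early.
        String json = "{\"comment\":\"</script><script>alert(1)</script>\"}";
        System.out.println(embedInScript(json));                  // closing tag appears inside the data
        System.out.println(embedInScript(escapeHtmlChars(json))); // only the template's own closing tag remains
    }
}
```

In the first line printed, the browser's HTML parser would end the script element at the first closing tag it sees, which sits inside the data, and then treat the rest as markup. In the second, the payload contains no literal angle brackets, and a JSON or JavaScript parser still decodes the escapes back to the original string.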
Joel Leitch wrote a great response to a similar question. Here are the highlights:
Characters such as <, >, =, etc. are escaped because if the JSON string evaluated by Gson is embedded in an XHTML page then we do not know what characters are actually wrapping this JSON string. Therefore, if there was an open quote, then the embedded JSON, followed by a closing quote, we do not know what will happen. Maybe if the Gson string contains abc=123 and there happens to be a "var abc" defined, then embedding the Gson output in the page may cause the abc JavaScript variable to be assigned the value 123. The same thing can happen with < and > or even &.
As for the whitespace escaping, \t is an escaped character for a tab. Likewise, \n and \r are escape characters for newlines and carriage returns. Escaping whitespace like this should ensure that any editor will show the proper whitespace (if the editor properly evaluates these escaped characters).
The Escaper and JsonWriter classes contain more information on the complete set of characters escaped by Gson.