We are using Jsoup to parse, manipulate, and extend an HTML template. Everything works fine until single quotes are used to delimit HTML attribute values:
<span data-attr='JSON'></span>
That HTML snippet is converted to
<span data-attr="JSON"></span>
which conflicts with the inner JSON data, because JSON is only valid with double quotes:
{"param" : "value"} //valid
{'param' : 'value'} //invalid
So we need to force Jsoup NOT to change those single quotes to double quotes, but how? This is our current code to parse the template and produce the HTML content:
pageTemplate = Jsoup.parse(new File(mainTemplateFilePath), "UTF-8");
pageTemplate.outputSettings().escapeMode(Entities.EscapeMode.xhtml);
pageTemplate.outputSettings().charset("UTF-8");
// ... adding some html
pageTemplate.html(); // will output the double quoted attributes :(
You need to HTML-encode the JSON value before putting it into the data-attr attribute. When you do so, you should end up with this:
<span data-attr="{&quot;param&quot;:&quot;value&quot;}"></span>
Although that looks fairly daunting, it is actually valid HTML. The browser decodes the &quot; entities back into " characters when it parses the attribute, so when your corresponding JavaScript executes someSpan.getAttribute("data-attr"), you get the original valid JSON string.
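To make the encoding step concrete, here is a minimal sketch in plain Java. The class AttrEscape and its method escapeAttribute are hypothetical names used only for illustration; they are not part of Jsoup. It assumes the value ends up inside a double-quoted attribute.

```java
// Hypothetical helper (not a Jsoup API): escapes a string so it can be
// placed safely inside a double-quoted HTML attribute value.
final class AttrEscape {

    static String escapeAttribute(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;  // keeps literal ampersands from being read as entities
                case '"': out.append("&quot;"); break; // would otherwise terminate the attribute value
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String json = "{\"param\":\"value\"}";
        String html = "<span data-attr=\"" + escapeAttribute(json) + "\"></span>";
        System.out.println(html);
        // prints: <span data-attr="{&quot;param&quot;:&quot;value&quot;}"></span>
    }
}
```

With the value encoded this way, it no longer matters which quote character delimits the attribute in the serialized HTML, because the JSON's own double quotes never appear literally inside it.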