EDIT: For future reference, I'm using a non-XHTML document type declaration (<!DOCTYPE html>).
I'm creating a website using Django, and I'm trying to embed arbitrary JSON data in my pages to be used by client-side JavaScript code.
Let's say my JSON object is {"foo": "</script>"}. If I embed this directly,
<script type='text/javascript'>JSON={"foo": "</script>"};</script>
The first </script> closes the script element, cutting the JSON object short. (It also makes the site vulnerable to XSS, since this JSON object is generated dynamically.)
If I use Django's HTML escape function, the resulting output is:
<script type='text/javascript'>JSON={&quot;foo&quot;: &quot;&lt;/script&gt;&quot;};</script>
and the browser cannot interpret the contents of the <script> tag.
The question I have here is,
If you are using XHTML, you would be able to use entity references (&lt;, &gt;, &amp;) to escape any string you want within <script>. You would not want to use a <![CDATA[...]]> section, because the sequence "]]>" can't be expressed within a CDATA section, and you would have to change the script to express ]]>.
But you're probably not using XHTML. If you're using regular HTML, the <script> tag acts somewhat like a CDATA section in XML, except that it has even more pitfalls. It ends with </script>. There are also arcane rules to allow <!-- document.write("<script>...</script>") --> (the comments and <script> opening tag must both be present for </script> to be passed through). The compromise that the HTML5 editors adopted for future browsers is described in HTML 5 tokenization and CDATA Escapes.
I think the takeaway is that you must prevent </script> from occurring in your JSON, and to be safe you should also avoid <script>, <!--, and --> to prevent runaway comments or script tags. I think it's easiest just to replace < with \u003c and --> with --\>.
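For what it's worth, here is a minimal Python sketch of that replacement, assuming the JSON is produced with the standard library's json.dumps. The helper name json_for_script is hypothetical, and it swaps in \u003e for every > instead of rewriting only --> as --\>, since \u003e is a valid escape in both JSON and JavaScript; & is escaped as well for good measure.

```python
import json


def json_for_script(value):
    """Serialize value to JSON that should be safe inside an HTML <script> element.

    Hypothetical helper: json.dumps leaves <, > and & untouched, so they are
    rewritten as \\uXXXX escapes, which JSON and JavaScript both accept.
    """
    text = json.dumps(value)
    return (
        text.replace("<", "\\u003c")   # blocks </script>, <script> and <!--
            .replace(">", "\\u003e")   # blocks -->
            .replace("&", "\\u0026")   # assumption: also guards the XHTML entity case
    )


payload = json_for_script({"foo": "</script>"})
print("<script type='text/javascript'>JSON=%s;</script>" % payload)
```

The escaped text still parses back to the original value, so the client-side code sees the same data. Recent Django versions also ship a json_script template filter that performs a similar escaping, which may be worth checking before rolling your own.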
I tried backslash escaping the forward slash and that seems to work:
<script type='text/javascript'>JSON={"foo": "<\/script>"};</script>
Have you tried that?
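If you want to do that on the server side, a rough sketch in Python could look like the following. The function name escape_closing_tags is made up for illustration, and it assumes the JSON comes from json.dumps. Note that it only handles </ sequences, so <!-- and <script> inside a string are left alone.

```python
import json


def escape_closing_tags(value):
    """Serialize value to JSON and backslash-escape the slash in every "</".

    Hypothetical helper: "\\/" is a legal escape in JSON and in JavaScript
    strings, so the parsed value is unchanged.
    """
    text = json.dumps(value)
    return text.replace("</", "<\\/")


snippet = escape_closing_tags({"foo": "</script>"})
print("<script type='text/javascript'>JSON=%s;</script>" % snippet)
```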
On a side note, I am surprised that the embedded </script> tag in a string breaks the JavaScript. Couldn't believe it at first but tested in Chrome and Firefox.