I have a JavaScript application that uses client-side templates (underscore.js, Backbone.js).
Data for the initial page load is bootstrapped into the page like this (.cshtml Razor file):
<div id="model">@Json.Encode(Model)</div>
The Razor engine performs escaping, so if the Model is
new { Title = "<script>alert('XSS');</script>" }
then the output is:
<div id="model">{"Title":"\u003cscript\u003ealert(\u0027XSS\u0027);\u003c/script\u003e"}</div>
After the "parse" operation:
var data = JSON.parse($("#model").html());
the data object has a "Title" field that is exactly "<script>alert('XSS');</script>"!
When this goes into an Underscore template, the alert fires.
Somehow, escapes like \u003c are treated as real "<" characters.
How do I escape "<" and ">" symbols coming from the DB (if they somehow got there) to &lt; and &gt;?
Maybe I can tune Json.Encode serialization for escaping these symbols?
Maybe I can set up Entity Framework, which I'm using, to escape these symbols automatically every time data is read from the DB?
\u003c and similar codes are perfectly valid in JS. You can obfuscate whole JS files using this syntax, if you so choose. Essentially, you're seeing an escape character \, then u for Unicode, and then a four-digit hexadecimal code that identifies a character.
http://javascript.about.com/library/blunicode.htm
\u003c - as you've noted, is the < character.
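To make this concrete, here is a minimal sketch of what happens on the client. The JSON text is hard-coded (an assumption, standing in for what $("#model").html() returns) so the snippet runs on its own:

```javascript
// The JSON text as it sits in the page. The doubled backslashes produce
// literal "\u003c" sequences in the string, exactly like the Razor output.
const jsonText =
  '{"Title":"\\u003cscript\\u003ealert(\\u0027XSS\\u0027);\\u003c/script\\u003e"}';

// JSON.parse decodes every \uXXXX escape back into the character it names.
const data = JSON.parse(jsonText);

console.log(data.Title); // <script>alert('XSS');</script>
```

So the escaping only protects the JSON while it is embedded in the HTML. Once parsed, the value is the original string again.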
One approach to "fixing" this on the MVC side would be to write a RegEx which looks for the pattern \u - and then captures the next 4 characters. You could then un-encode them into actual unicode characters - and run the resultant text through your XSS prevention algorithms.
As you've noted in your question - just looking for "<" doesn't help. You also can't just look for "\u003cscript" - because this assumes the potential hacker hasn't simply unicode-encoded the entire "script" tag word. The safer approach is to un-escape all of these kinds of codes and then cleanse your HTML in plain-text.
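A rough sketch of that un-escaping step, written in JavaScript to match the question's client code (the answer above describes doing it on the MVC side, where the same regex idea applies). The function name decodeUnicodeEscapes is made up for illustration, and the sketch does not handle an escaped backslash in front of the u:

```javascript
// Hypothetical helper: turn every literal \uXXXX sequence into its character.
function decodeUnicodeEscapes(text) {
  return text.replace(/\\u([0-9a-fA-F]{4})/g, function (match, hex) {
    // parseInt with radix 16 converts the four hex digits to a code unit.
    return String.fromCharCode(parseInt(hex, 16));
  });
}

// Doubled backslashes so the string holds literal "\u003c" sequences.
const raw = "\\u003cscript\\u003e";
const decoded = decodeUnicodeEscapes(raw);

console.log(decoded); // <script>
```

The decoded text is what you would then hand to whatever cleansing routine you use.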
Incidentally, it might make you feel better to note that this is one of the common (and thus far poorly resolved) issues in XSS prevention. So you aren't alone in wanting a better solution...
You might check out the following libraries to assist in the actual html cleansing:
http://wpl.codeplex.com/ (Microsoft's attempt at a solution, though with very bad user feedback)
https://www.owasp.org/index.php/Category:OWASP_AntiSamy_Project_.NET (A private project which is designed to do a lot of this kind of prevention. I find it hard to use, and poorly implemented in .NET)
Both are good references, though.
You need to encode your string as HTML before providing it to Underscore.
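As a minimal sketch of what that encoding amounts to, here is a hand-rolled escaper. The names escapeHtml and HTML_ESCAPES are made up for illustration. In practice Underscore already ships this: `_.escape(value)` does the same job, and a template written with `<%- Title %>` instead of `<%= Title %>` escapes the value for you.

```javascript
// Characters that are significant in HTML, mapped to their entities.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#x27;"
};

// Hypothetical helper: replace each significant character with its entity.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, function (ch) {
    return HTML_ESCAPES[ch];
  });
}

const title = "<script>alert('XSS');</script>";
const safeTitle = escapeHtml(title);

console.log(safeTitle);
// &lt;script&gt;alert(&#x27;XSS&#x27;);&lt;/script&gt;
```

Escaping at output time, in the template, means the data can stay untouched in the DB and in the JSON.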
"HTML escaping in Underscore.js templates" explains how to do this.