
Prevent XSS attacks and still use Html.Raw

I have a CMS in which I use CKEditor to enter content. If a user types <script>alert('This is a bad script, data');</script>, CKEditor does a fair job: it encodes the input correctly and passes &lt;script&gt;alert(&#39;This is a bad script, data&#39;)&lt;/script&gt; to the server.

But if the user opens the browser developer tools (using Inspect element) and adds the script directly to the editor's markup, as shown in the screenshot below, the trouble starts. When the content is later retrieved from the database and displayed in the browser, the alert box appears.

[Screenshot: editing CKEditor contents through Inspect element]

So far I have tried several things. One of them is:

  • Encode the contents using AntiXssEncoder [HttpUtility.HtmlEncode(Contents)], store the result in the database, and when displaying it in the browser decode it and output it with MvcHtmlString.Create [MvcHtmlString.Create(HttpUtility.HtmlDecode(Contents))] or Html.Raw [Html.Raw(Contents)]. As you might expect, both of them display the JavaScript alert.

I don't want to strip the <script> tags manually in code, as that is not a comprehensive solution (search for "And the encoded state:").
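The round trip described in the bullet can be reproduced outside ASP.NET. The following is a minimal sketch in Python, using the standard library's html.escape and html.unescape as stand-ins for HttpUtility.HtmlEncode and HttpUtility.HtmlDecode (an assumption made for illustration; the exact entity spellings differ between the two platforms). It suggests why the alert comes back: decoding before raw output restores the original markup exactly.

```python
import html

# The payload from the question, as it might arrive at the server
# when the client-side editor is bypassed.
submitted = "<script>alert('This is a bad script, data');</script>"

# Step 1: encode before storing (stand-in for HttpUtility.HtmlEncode).
stored = html.escape(submitted, quote=True)

# Step 2: decode before raw output (stand-in for HttpUtility.HtmlDecode).
rendered = html.unescape(stored)

# The stored form is inert text: it contains no literal tag.
print("<script" in stored)    # False

# The decoded form is byte-for-byte the original payload, so emitting it
# unencoded (as Html.Raw does) hands the browser a live script element.
print(rendered == submitted)  # True
```

Encoding on the way in and decoding on the way out cancel each other, so the pair adds no protection by itself; the encoded form is only safe while it stays encoded.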

So far I have referred to many articles (sorry for not listing them all here; I am adding a few just to show that I made a sincere effort before writing this question), but none of them has code that shows the answer. Maybe there is an easy answer and I am not looking in the right direction, or maybe it is not that simple at all and I need to use something like Content Security Policy.

  • ASP.Net MVC Html.Raw with AntiXSS protection
  • Is there a risk in using @Html.Raw?
  • http://blog.simontimms.com/2013/01/21/content-security-policy-for-asp-net-mvc/
  • http://blog.michaelckennedy.net/2012/10/15/understanding-text-encoding-in-asp-net-mvc/

To reproduce what I am describing, go to *this url, type <script>alert('This is a bad script, data');</script> in the text box, and click the button.

*This link is from Michael Kennedy's blog

ndd asked Jul 16 '15 20:07


1 Answer

It isn't easy, and you probably don't want to do this. May I suggest you use a simpler language than HTML for end-user formatted input? What about Markdown, which (I believe) is used by Stack Overflow? Or one of the existing wiki or other lightweight markup languages?
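To illustrate why a simpler markup language is easier to make safe, here is a minimal sketch in Python. It is not Markdown and not any library's API: render_light_markup is a hypothetical helper that supports only **bold** and *italic*. The point is the order of operations: every character the user typed is HTML-encoded first, and the only tags in the output are the ones the renderer itself emits.

```python
import html
import re


def render_light_markup(text):
    """Render a tiny, hypothetical markup subset to HTML.

    The input is HTML-encoded first, so any tags the user typed
    become inert text before the renderer adds its own tags.
    """
    safe = html.escape(text, quote=True)
    # Bold must be handled before italic so that ** is not read as two *.
    safe = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", safe)
    safe = re.sub(r"\*(.+?)\*", r"<em>\1</em>", safe)
    return safe


print(render_light_markup("**bold** <script>alert(1)</script>"))
# prints: <strong>bold</strong> &lt;script&gt;alert(1)&lt;/script&gt;
```

With this design the output can be written to the page without a decode step, because user-supplied markup never reaches the browser as markup. A real project would more likely use an established Markdown implementation, configured so that embedded raw HTML is not passed through.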

If you do allow HTML, I would suggest the following:

  • Only support a fixed subset of HTML.
  • After the user submits content, parse the HTML on the server and filter it against a whitelist of allowed tags and attributes.
  • Be ruthless in filtering, and eliminate anything that you aren't sure about.
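The steps above can be sketched as follows. This is a minimal illustration in Python using only the standard library's html.parser, not a vetted sanitizer: the whitelist contents, the WhitelistSanitizer class, and the sanitize function are all assumptions made for the example. It does not rebalance unclosed tags, and it has not been tested against the known attack lists, so a maintained library remains the better choice for real sites.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; a real one is chosen per application.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li", "br", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
# Only absolute URLs with these prefixes survive; relative URLs are dropped too.
ALLOWED_SCHEMES = ("http://", "https://", "mailto:")
VOID_TAGS = {"br"}
# Elements whose text content is discarded along with the tag.
DROP_CONTENT = {"script", "style"}


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a script or style element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag, keep its text
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops onclick, style, and anything unlisted
            value = (value or "").strip()
            if name == "href" and not value.lower().startswith(ALLOWED_SCHEMES):
                continue  # drops javascript: and other unlisted schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Text is re-encoded, so stray < or & cannot form new markup.
            self.out.append(escape(data, quote=False))


def sanitize(markup):
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


print(sanitize("<p onclick=\"evil()\">Hi <b>there</b></p><script>alert('x')</script>"))
# prints: <p>Hi <b>there</b></p>
```

The filter rebuilds the output from parsed pieces rather than deleting substrings from the input, which is why it does not depend on recognizing every spelling of a dangerous tag. Anything not explicitly listed, including event-handler and style attributes, is left out of the result.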

There are existing tools and libraries that do this. I haven't used it, but I did stumble on http://htmlpurifier.org/. I assume there are many others. Rick Strahl has posted one example for .NET, but I'm not sure if it is complete.

About ten years ago I attempted to write my own whitelist filter. It parsed and normalized the entered HTML, then removed any elements or attributes that were not on the whitelist. It worked pretty well, but you never know what vulnerabilities you've missed. That project is long dead, but if I had to do it over I would use an existing simpler markup language rather than HTML.

There are so many ways for users to inject nasty stuff into your pages that you have to be fierce to prevent it. Even CSS can be used to inject executable expressions into your page, like:

<STYLE type="text/css">BODY{background:url("javascript:alert('XSS')")}</STYLE>

Here is a page with a list of known attacks that will keep you up at night. If you can't filter and prevent all of these, you aren't ready for untrusted users to post formatted content viewable by the public.

Right around the time I was working on my own filter, MySpace (wow, I'm old) was hit by an XSS worm known as Samy. Samy used style attributes with an embedded background URL that carried a JavaScript payload. It is all explained by the author.

Note that your example page says:

This page is meant to accept and display raw HTML by trusted editors.

The key issue here is trust. If all of your users are trusted (say, employees of a web site), then the risk is lower. However, if you are building a forum, social network, dating site, or anything else that allows untrusted users to enter formatted content that will be viewable by others, you have a difficult job sanitizing HTML.

Michael Levy answered Oct 23 '22 03:10