
Allowing only certain HTML tags as user input

My site allows users to write blog posts:

class BlogPost  
{  
 [AllowHtml]  
 public string Content;  
}

The site is built from the MVC 5 Internet application template and uses Bootstrap 3 for its CSS, so I decided to use http://jhollingworth.github.io/bootstrap-wysihtml5 to handle the JavaScript side of a rich text editor.

It works like a charm. But to make the POST go through, I had to add the [AllowHtml] attribute as in the code above. So now I'm worried about dangerous markup getting into the database and, in turn, being displayed to all users.

I tried entering values like <script>alert("What's up?")</script> in the form and it seemed to be fine: the text was displayed exactly as typed (<script> became &lt;script&gt;). But this conversion seemed to be done by the JavaScript plugin I used.

So I used Fiddler to compose a POST request containing the same script tag, bypassing the plugin, and this time the page actually executed the JavaScript code.

Is there any way to detect dangerous input such as <script> or even <a href="javascript:some_code">Link</a> on the server?

asked Jan 27 '26 03:01 by galdin


1 Answer

Unfortunately, you have to sanitize the HTML yourself on the server. These questions show how other people have done it:

  1. How to sanitize input from MCE in ASP.NET? - whitelist using Html Agility Pack
  2. .NET HTML Sanitation for rich HTML Input - blacklist using Html Agility Pack
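
To make the whitelist idea concrete, here is a minimal sketch. It does not use Html Agility Pack as the linked answers do; instead it swaps in a simpler technique that needs only the base class library: HTML-encode the entire input, then restore a fixed set of attribute-less tags. The class name HtmlWhitelist, the method Sanitize, and the particular tag list are assumptions made for illustration, and anything not on the list (including links and every attribute) is shown to readers as literal text rather than rendered.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of ASP.NET MVC or any library.
public static class HtmlWhitelist
{
    // Matches an already-encoded, attribute-less opening or closing tag
    // whose name is on the (assumed) whitelist.
    private static readonly Regex AllowedTag = new Regex(
        @"&lt;(/?)(b|i|u|em|strong|p|ul|ol|li|blockquote|br)&gt;",
        RegexOptions.IgnoreCase);

    public static string Sanitize(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return string.Empty;
        }

        // Step 1: neutralize everything. After this call the string
        // contains no raw '<', '>' or '&' that came from the user.
        string encoded = WebUtility.HtmlEncode(input);

        // Step 2: turn only the whitelisted tags back into real markup.
        return AllowedTag.Replace(encoded, "<$1$2>");
    }
}

public static class Program
{
    public static void Main()
    {
        // The <b> tag survives; the <script> tag stays encoded.
        Console.WriteLine(HtmlWhitelist.Sanitize("<b>hi</b><script>x</script>"));
        // prints: <b>hi</b>&lt;script&gt;x&lt;/script&gt;
    }
}
```

With something like this in place, the controller action would call HtmlWhitelist.Sanitize(post.Content) before saving, and the view would output the stored value unencoded. If the editor needs links, images, or styled spans, a real parser with a per-attribute whitelist (as in the Html Agility Pack answers above) is the safer route.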

An alternative to accepting HTML is to accept Markdown or BBCode instead. Both are widely used (Markdown is used by Stack Overflow!) and largely remove the need to sanitize the input, provided the renderer escapes any raw HTML embedded in the text. There are rich editors available for them too.

Edit

I found that the Microsoft Web Protection Library can sanitize HTML input through AntiXss.GetSafeHtml and AntiXss.GetSafeHtmlFragment. The documentation is really poor, though, and it seems you can't configure which tags are allowed.

answered Jan 29 '26 19:01 by LostInComputer