
HTML Sanitizer for .NET

I'm starting a public-facing project using ASP.NET MVC. I know there are about a billion PHP, Python, and Ruby HTML sanitizers out there, but does anyone have pointers to anything good in .NET? What are your experiences with what is available? I know Stack Overflow is an ASP.NET site that allows freeform HTML. What does it use?

asked Dec 04 '08 20:12 by Matt Briggs



4 Answers

HtmlSanitizer

Source: https://github.com/mganss/HtmlSanitizer

A fairly robust sanitizer. It understands and can clean inline styles, but it doesn't have a parser that can deal with <style> blocks, so it strips them. It is certainly on par with, and probably better than, Microsoft's AntiXSS at the point that library was abandoned.
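As a rough, language-neutral illustration of what "cleaning inline styles" can mean, here is a minimal Python sketch using only the standard library. It is not HtmlSanitizer's implementation or API. The function name `clean_inline_style` and the `ALLOWED_PROPERTIES` set are hypothetical, and splitting on `;` is a deliberate simplification that a real CSS parser would not make.

```python
# Hypothetical whitelist of CSS properties that may appear in a style attribute.
ALLOWED_PROPERTIES = {"color", "font-weight", "text-align"}


def clean_inline_style(style):
    """Keep only whitelisted declarations from an inline style string.

    Naive sketch: splits on ';' and ':' instead of parsing CSS properly.
    """
    kept = []
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        name = name.strip().lower()
        value = value.strip()
        # Skip malformed declarations and properties not on the whitelist.
        if not sep or not value or name not in ALLOWED_PROPERTIES:
            continue
        lowered = value.lower()
        # Reject values that can load resources or run script, and CSS escapes.
        if "url(" in lowered or "expression(" in lowered or "\\" in value:
            continue
        kept.append(f"{name}: {value}")
    return "; ".join(kept)


if __name__ == "__main__":
    print(clean_inline_style("color: red; position: fixed; font-weight: bold"))
```

The point of the sketch is the direction of the check: a declaration survives only if its property is explicitly allowed, rather than being removed only if it is known to be dangerous.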

answered Sep 20 '22 12:09 by pattermeister


https://blog.stackoverflow.com/2008/06/safe-html-and-xss/

answered Sep 20 '22 12:09 by kokos


HtmlRuleSanitizer

Based on your question I have the following suggestions:

  • You want to allow free form HTML, so you need a solution to be able to specify a set of tags, attributes and/or CSS classes which are allowed.
  • By allowing free form HTML it is likely that you'll also have to deal with malformed HTML because users make errors (deliberate or not). You thus need a solution built on a tolerant parser such as the Html Agility Pack.
  • You'll want to take a white listing approach, because a black listing sanitizer does not protect you from anything added in new HTML specifications. In addition, it is very hard to guarantee that a black list covers all cases in the first place, given the size of the HTML specification.

I faced the same problem and built HtmlRuleSanitizer, a rule-based, white listing HTML sanitizer built on top of the Html Agility Pack.
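The three points above (an explicit set of allowed tags and attributes, a parser that tolerates malformed markup, and white listing instead of black listing) can be sketched in a few lines. The example below is Python using only the standard library's `html.parser`, chosen to keep it self-contained. It is not the HtmlRuleSanitizer API, and the names `ALLOWED`, `SAFE_SCHEMES`, `WhitelistSanitizer` and `sanitize` are hypothetical. Treat it as a sketch of the idea, not a production sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical white list: tag name -> attributes permitted on that tag.
ALLOWED = {"b": set(), "i": set(), "p": {"class"}, "a": {"href"}}
# Tags whose text content is dropped along with the tag itself.
DROP_CONTENT = {"script", "style"}
# Only these URL prefixes are accepted in href values.
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # output fragments
        self.open_tags = []  # allowed tags currently open
        self.skip = 0        # > 0 while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip += 1
            return
        if self.skip or tag not in ALLOWED:
            return  # unknown tag: drop the tag, keep its text
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED[tag]:
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip = max(0, self.skip - 1)
            return
        if self.skip or tag not in self.open_tags:
            return
        # Close anything the user left open inside this element.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data, quote=False))


def sanitize(html):
    parser = WhitelistSanitizer()
    parser.feed(html)
    parser.close()
    # Close tags that were never closed in the input.
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<p class="x" onclick="evil()">Hi <b>there</p>'))
```

Because the output is rebuilt from the white list rather than edited in place, anything the rules do not mention (event handlers, unknown tags, `javascript:` links) simply never reaches the output, and unbalanced tags are closed on the way out.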

answered Sep 17 '22 12:09 by Christ A


There is a C# version here

answered Sep 19 '22 12:09 by Ashok Padmanabhan