
Clean user HTML in .NET

My C# site allows users to submit HTML to be displayed on the site. I would like to limit the tags and attributes allowed in that HTML, but I can't figure out how to do this in .NET.

I've tried using Html Agility Pack, but I don't see how to modify the HTML. I can see how to walk through the HTML and find certain data, but actually generating an output file is baffling me.

Does anyone have a good example of cleaning up HTML in .NET? Html Agility Pack might be the answer, but its documentation is lacking.

spaetzel asked Jan 06 '10 15:01


3 Answers

I would strongly recommend Microsoft's Anti-XSS Library for sanitizing input. It supports sanitizing HTML.

David answered Nov 01 '22 18:11


You should only accept well-formed HTML, that is, markup that also parses as XML.

You can then use LINQ to XML to parse and modify it.

You can make a recursive function that takes an element from the user and returns a new element with a whitelisted set of tags and attributes.

For example:

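using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Maps allowed tags to the attributes allowed on each tag.
static readonly Dictionary<string, string[]> AllowedTags = new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase) {
    { "b",    new string[0] },
    { "img",  new string[] { "src", "alt" } },
    //...
};

// Returns a copy of dirtyElement that keeps only whitelisted attributes and child elements.
// The caller must make sure dirtyElement's own tag is in AllowedTags (the lookup throws otherwise).
static XElement CleanElement(XElement dirtyElement) {
    string[] allowedAttributes = AllowedTags[dirtyElement.Name.LocalName];
    return new XElement(dirtyElement.Name,
        dirtyElement.Attributes()
            .Where(a => allowedAttributes.Contains(a.Name.LocalName, StringComparer.OrdinalIgnoreCase)),
        dirtyElement.Nodes()
            // Keep text nodes as they are; keep child elements only if their tag is whitelisted.
            .Where(n => n is XText || (n is XElement && AllowedTags.ContainsKey(((XElement)n).Name.LocalName)))
            .Select(n => n is XElement ? CleanElement((XElement)n) : n)
    );
}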

If you allow hyperlinks, make sure to disallow javascript: URLs; this code doesn't do that.
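
One way to approach the javascript: problem is to accept a URL attribute value only when it parses as an absolute URL whose scheme is on a whitelist, and to reject everything else. The following is a minimal sketch, not part of the answer's code: the class name `UrlWhitelist`, the method `IsSafeUrl`, and the list of schemes are all assumptions made for illustration. It deliberately rejects relative URLs too, which is stricter than many sites need.

```csharp
using System;
using System.Collections.Generic;

public static class UrlWhitelist
{
    // Hypothetical scheme whitelist; adjust to what your site actually needs.
    private static readonly HashSet<string> AllowedSchemes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "http", "https", "mailto" };

    // Returns true only for absolute URLs whose scheme is whitelisted.
    // Anything that fails to parse as an absolute URL is rejected as well.
    public static bool IsSafeUrl(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
            return false;

        Uri uri;
        if (!Uri.TryCreate(value, UriKind.Absolute, out uri))
            return false;

        return AllowedSchemes.Contains(uri.Scheme);
    }
}
```

A check like this could be applied to the values of attributes such as href and src before they are copied into the cleaned element.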

SLaks answered Nov 01 '22 19:11


With HtmlAgilityPack you can remove unwanted tags from the input:

node.ParentNode.RemoveChild(node);
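
To show where that call fits, here is a rough sketch of a full round trip: load the markup, remove every element whose tag is not whitelisted, and read the result back as a string. The class name `HtmlTagStripper`, the method `RemoveDisallowedTags`, and the tag list are assumptions for illustration only. It relies on the HtmlAgilityPack NuGet package and filters tags only, so attributes such as onclick would still need separate handling.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // third-party NuGet package

public static class HtmlTagStripper
{
    // Hypothetical whitelist for illustration.
    private static readonly HashSet<string> AllowedTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "b", "i", "p", "img" };

    public static string RemoveDisallowedTags(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Snapshot the matches with ToList() so removals don't disturb the enumeration.
        var unwanted = doc.DocumentNode.Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Element && !AllowedTags.Contains(n.Name))
            .ToList();

        foreach (var node in unwanted)
            node.ParentNode.RemoveChild(node); // removes the node together with its children

        // OuterHtml serializes the modified tree back to markup.
        return doc.DocumentNode.OuterHtml;
    }
}
```

The modified markup comes from `doc.DocumentNode.OuterHtml`; `doc.Save(...)` can be used instead when the output should go to a file or stream.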

morsanu answered Nov 01 '22 20:11