Is there a comprehensive HTML cleaner/anti-XSS library for .NET that also has a defined whitelist? I know that Microsoft's Anti-XSS library is a good place to start, but it needs a good whitelist for allowed HTML tags and CSS. Does anyone know of something?
What's wrong with Microsoft's Anti-XSS library (which you've mentioned)?
It provides comprehensive HTML sanitizing: it filters the characters based on a whitelist, parses the HTML, filters the nodes based on a whitelist, and then regenerates the (safe) HTML. You can change the whitelists (since the code is open source), but I'm not sure you'd want to.
Usage is simple too:
var sanitizedHtml = Microsoft.Security.Application.Sanitizer.GetSafeHtmlFragment(inputHtml);
According to MSDN (see "Allowing Restricted HTML Input"), the best way to sanitize HTML input is to call HttpUtility.HtmlEncode() on your input and then selectively reverse the encoding for the tags on your whitelist, like so:
<%@ Page Language="C#" ValidateRequest="false" %>
<%@ Import Namespace="System.Text" %>
<script runat="server">
    void submitBtn_Click(object sender, EventArgs e)
    {
        // Encode the entire input so that no markup survives by default
        StringBuilder sb = new StringBuilder(
            HttpUtility.HtmlEncode(htmlInputTxt.Text));
        // Selectively restore the whitelisted <b> and <i> tags
        sb.Replace("&lt;b&gt;", "<b>");
        sb.Replace("&lt;/b&gt;", "</b>");
        sb.Replace("&lt;i&gt;", "<i>");
        sb.Replace("&lt;/i&gt;", "</i>");
        Response.Write(sb.ToString());
    }
</script>
See also this article.
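For anyone who wants to try the encode-then-restore idea outside of a WebForms page, here is a minimal sketch of the same technique as a standalone helper. The class name WhitelistEncoder, the method name EncodeWithWhitelist, and the AllowedTags list are made up for illustration and are not part of any library. It also uses System.Net.WebUtility.HtmlEncode instead of HttpUtility.HtmlEncode so that it compiles without a reference to System.Web. Treat it as a starting point under those assumptions, not a complete sanitizer.

```csharp
using System;
using System.Net;
using System.Text;

public static class WhitelistEncoder
{
    // Hypothetical whitelist: only attribute-free, lowercase tags are restored.
    private static readonly string[] AllowedTags = { "b", "i" };

    public static string EncodeWithWhitelist(string input)
    {
        // Encode everything first, so no markup survives by default.
        var sb = new StringBuilder(WebUtility.HtmlEncode(input ?? string.Empty));

        // Then reverse the encoding only for the exact whitelisted tags.
        foreach (var tag in AllowedTags)
        {
            sb.Replace("&lt;" + tag + "&gt;", "<" + tag + ">");
            sb.Replace("&lt;/" + tag + "&gt;", "</" + tag + ">");
        }

        return sb.ToString();
    }
}
```

Calling WhitelistEncoder.EncodeWithWhitelist("<b>hi</b><script>alert(1)</script>") leaves the bold tags intact and keeps the script element encoded as text. Note that the match is exact: tags with attributes (for example <b onclick="...">) or in upper case (<B>) stay encoded. That is the safe default, but it also means this approach cannot allow things like links with href attributes without a real parser-based sanitizer.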