I am using the OWASP Html Sanitizer to prevent XSS attacks on my web app. For many fields that should be plain text the Sanitizer is doing more than I expect.
For example:
HtmlPolicyBuilder htmlPolicyBuilder = new HtmlPolicyBuilder();
stripAllTagsPolicy = htmlPolicyBuilder.toFactory();
stripAllTagsPolicy.sanitize('a+b'); // return a+b
stripAllTagsPolicy.sanitize('[email protected]'); // return foo@example.com
When I have fields such as email address that have a + in it such as [email protected] I end up with the wrong data in the the database. So two questions: 
+ - @ dangerous on their own do they really need to be encoded? Question 2 is the more important one for me to get an answer to.
Sanitize a string immediatelysetHTML() is used to sanitize a string of HTML and insert it into the Element with an id of target . The script element is disallowed by the default sanitizer so the alert is removed.
HtmlPolicyBuilder is fast and easy to configure HTML Sanitizer which lets you include HTML authored by third-parties in your web application while protecting against XSS. You can read more about the underlying implementation here.
The sanitize() method of the Sanitizer interface is used to sanitize a tree of DOM nodes, removing any unwanted elements or attributes. It should be used when the data to be sanitized is already available as DOM nodes. For example when sanitizing a Document instance in a frame.
You may want to use ESAPI API to filter specific characters. Although if you like to allow specific HTML element or attribute you can use following allowElements and allowAttributes.
// Define the policy.
Function<HtmlStreamEventReceiver, HtmlSanitizer.Policy> policy
     = new HtmlPolicyBuilder()
         .allowElements("a", "p")
         .allowAttributes("href").onElements("a")
         .toFactory();
 // Sanitize your output.
 HtmlSanitizer.sanitize(myHtml, policy.apply(myHtmlStreamRenderer));
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With