
How to sanitize HTML code in Java to prevent XSS attacks?

I'm looking for a class/util etc. to sanitize HTML code, i.e. remove dangerous tags, attributes and values to avoid XSS and similar attacks.

I get HTML code from a rich text editor (e.g. TinyMCE), but it can also be sent maliciously, bypassing TinyMCE's validation ("Data submitted from off-site").

Is there anything as simple to use as InputFilter in PHP? The perfect solution I can imagine works like this (assume the sanitizer is encapsulated in an HtmlSanitizer class):

    String unsanitized = "...<...>...";           // some potentially
                                                  // dangerous html here on input
    HtmlSanitizer sat = new HtmlSanitizer();      // sanitizer util class created
    String sanitized = sat.sanitize(unsanitized); // voila - sanitized is safe...

Update: the simpler the solution, the better! A small util class with as few external dependencies on other libraries/frameworks as possible would be best for me.


How about that?

WildWezyr asked Aug 05 '10 09:08




2 Answers

You can try OWASP Java HTML Sanitizer. It is very simple to use.

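    import org.owasp.html.HtmlPolicyBuilder;
    import org.owasp.html.PolicyFactory;

    // Allow only <a href="https://..."> links and force rel="nofollow" on them.
    PolicyFactory policy = new HtmlPolicyBuilder()
        .allowElements("a")
        .allowUrlProtocols("https")
        .allowAttributes("href").onElements("a")
        .requireRelNofollowOnLinks()
        .toFactory();

    // untrustedHTML is the raw input string, e.g. what the rich text editor submitted.
    String safeHTML = policy.sanitize(untrustedHTML);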
Saljack answered Oct 14 '22 17:10



You could use OWASP ESAPI for Java, which is a security library that is built to do such operations.

Not only does it have encoders for HTML, it also has encoders for JavaScript, CSS and URL contexts. Sample uses of ESAPI can be found in the XSS prevention cheat sheet published by OWASP.
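
To make the idea of output encoding concrete, here is a minimal plain-Java sketch of HTML entity encoding. It is not ESAPI's implementation or API; the class name HtmlEscaper and the method escapeHtml are hypothetical names made up for this illustration, and it only covers HTML text and quoted attribute values (JavaScript, CSS and URL contexts need their own encoders).

```java
// Minimal sketch of HTML output encoding. NOT ESAPI's implementation.
// HtmlEscaper and escapeHtml are hypothetical names used for illustration only.
final class HtmlEscaper {
    private HtmlEscaper() {}

    /**
     * Replaces the five characters that are significant in HTML text
     * and in quoted attribute values with their entity equivalents.
     * A null input is treated as an empty string.
     */
    static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Calling HtmlEscaper.escapeHtml("<b>hi</b>") gives "&lt;b&gt;hi&lt;/b&gt;". Keep in mind that encoding turns all markup into text, so it suits plain-text fields; for rich text from an editor such as TinyMCE, where some tags must survive, an allowlist-based sanitizer is the tool to use.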

You could use the OWASP AntiSamy project to define a site policy that states what is allowed in user-submitted content. The site policy can be later used to obtain "clean" HTML that is displayed back. You can find a sample TinyMCE policy file on the AntiSamy downloads page.
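
To illustrate what such a site policy boils down to, here is a small sketch of an allowlist lookup in plain Java. This is not AntiSamy's API (AntiSamy reads its policy from an XML file); AllowlistPolicy and its methods are hypothetical names, and the sketch only answers whether a tag or attribute is permitted. It does not parse or clean HTML.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of an allowlist policy. NOT AntiSamy's API.
final class AllowlistPolicy {
    // Maps each allowed tag name (lowercase) to its allowed attribute names (lowercase).
    private final Map<String, Set<String>> allowedAttributesByTag;

    AllowlistPolicy(Map<String, Set<String>> allowedAttributesByTag) {
        this.allowedAttributesByTag = allowedAttributesByTag;
    }

    /** Returns true only if the tag is explicitly listed in the policy. */
    boolean allowsTag(String tag) {
        return allowedAttributesByTag.containsKey(tag.toLowerCase(Locale.ROOT));
    }

    /** Returns true only if the tag is listed and the attribute is listed for that tag. */
    boolean allowsAttribute(String tag, String attribute) {
        Set<String> attributes = allowedAttributesByTag.get(tag.toLowerCase(Locale.ROOT));
        return attributes != null && attributes.contains(attribute.toLowerCase(Locale.ROOT));
    }
}
```

The point of the design is that everything not listed is rejected by default: a policy built from Map.of("a", Set.of("href", "title")) accepts a tag with href but rejects script tags and onclick attributes without ever naming them.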

Vineet Reynolds answered Oct 14 '22 18:10
