What's the best library/approach for removing Javascript from HTML that will be displayed?
For example, take:
<html><body><span onmousemove='doBadXss()'>test</span></body></html>
and leave:
<html><body><span>test</span></body></html>
I see the DeXSS project. But is that the best way to go?
JSoup has a simple method for sanitizing HTML based on a whitelist. Check http://jsoup.org/cookbook/cleaning-html/whitelist-sanitizer
It uses a whitelist, which is safer then the blacklist approach DeXSS uses. From the DeXSS page:
There are still a number of known XSS attacks that DeXSS does not yet detect.
A blacklist only disallows known unsafe constructions, while a whitelist only allows known safe constructions. So unknown, possibly unsafe constructions will only be protected against with a whitelist.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With