I want to be able to accept HTML from untrusted users and sanitize it so that I can safely include it in pages on my website. By this I mean that markup should not be stripped or escaped, but should be passed through essentially unchanged unless it contains dangerous tags such as <script>
or <iframe>
, dangerous attributes such as onload
, or dangerous CSS properties such as background URLs. (Apparently some older IEs will execute javascript URLs in CSS?)
Serving the content from a different domain, enclosed in an iframe, is not a good option because there is no way to tell in advance how tall the iframe has to be so it will always look ugly for some pages.
I looked into HTML Purifier, but it looks like it doesn't support HTML5 yet. I also looked into Google Caja, but I'm looking for a solution that doesn't use scripts.
Does anyone know of a library that will accomplish this? PHP is preferred, but beggars can't be choosers.
Sanitize a string immediatelysetHTML() is used to sanitize a string of HTML and insert it into the Element with an id of target . The script element is disallowed by the default sanitizer so the alert is removed.
HTML sanitization can be used to protect against attacks such as cross-site scripting (XSS) by sanitizing any HTML code submitted by a user.
HTML sanitization is an OWASP-recommended strategy to prevent XSS vulnerabilities in web applications. HTML sanitization offers a security mechanism to remove unsafe (and potentially malicious) content from untrusted raw HTML strings before presenting them to the user.
This plugin uses OWASP ESAPI library to sanitize request parameters. This reduces the risk of dangerous XSS request parameters possibly being rendered on the client. Owner: rpalcolea | 1.0.0 | Aug 22, 2016 | Package | Issues | Source | License: Apache-2.0.
The black listing approach puts you under upgrade pressure. So each time browsers start to support new standards you MUST draw your sanitizing tool to the same level. Such changes happen more often than you think.
White listing (which is achieved by strip_tags with well defined exceptions) of cause shrinks options for your users, but puts you on the save site.
On my own sites I have the policy to apply the black listing on pages for very trusted users (such as admins) and the whitelisting on all other pages. That sets me into the position to not put much effort into the black listing. With more mature role & permission concepts you can even fine grain your black lists and white lists.
UPDATE: I guess you look for this:
I got the point that strip_tags whitelists on tag level but does accept everything on attribute level. Interestingly HTMLpurifier seems to do the whitelisting on attribute level. Thanks, was a nice learning here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With