I am currently working on a project with a PHP frontend. We're quite concerned about security, because we'll have a lot of users and are an attractive target for hackers. Our users can submit HTML-formatted content that is later visible to other users. This is a big problem because it exposes us to the whole range of XSS attacks. We're filtering as well as we can, but the variety of attack vectors is large.
So, I'm looking for PHP-based HTML sanitizing/filtering solutions. Commercial solutions are fine (even preferred). Currently we're using a modified HTML Purifier, but we're not satisfied with the results.
What are some good libraries/tools that are capable of filtering malicious parts of HTML?
HTML5 awareness would be nice to have, since HTML5 will become a security nightmare once it's available "in the wild".
Update: We're now doing an in-depth configuration of HTML Purifier. It looks like the older framework we used before was not configuring it at all. The results now look much better.
PHP filter_var() Function
The filter_var() function filters a single variable with a specified filter. It takes two pieces of data: the variable you want to check and the type of check to use.
There are two main types of filtering: validation and sanitization.
PHP filters are used to validate and sanitize external input. The PHP filter extension has many of the functions needed for checking user input, and is designed to make data validation easier and quicker.
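To make the difference between validation and sanitization concrete, here is a minimal sketch using filters built into PHP's filter extension. The input strings are made up for illustration.

```php
<?php
// Validation: returns the (possibly type-converted) value on success, false on failure.
$email = filter_var('user@example.com', FILTER_VALIDATE_EMAIL);
$bad   = filter_var('not-an-email', FILTER_VALIDATE_EMAIL);
$age   = filter_var('42', FILTER_VALIDATE_INT);

// Sanitization: rewrites the value instead of rejecting it.
// Here every HTML special character is escaped, so no markup survives.
$escaped = filter_var('<b>hi</b>', FILTER_SANITIZE_FULL_SPECIAL_CHARS);

var_dump($email, $bad, $age, $escaped);
```

Note that these filters either reject a value or escape it wholesale. They do not keep a safe subset of HTML, so for user-submitted rich content a whitelist-based filter is still needed.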
HTML Purifier is an HTML filtering solution that uses a unique combination of robust whitelists and aggressive parsing to ensure that not only are XSS attacks thwarted, but the resulting HTML is standards compliant.
Personally, I have had very good results with the HTML Purifier project.
It is highly customizable and has a huge code base, so the only real drawback is uploading all of its files to your server.
Are you sure you don't have a configuration issue with your installation? If it is configured correctly, the purifier should not let through any HTML tags that are not on its whitelist.
From the web site:
HTML Purifier is a standards-compliant HTML filter library written in PHP. HTML Purifier will not only remove all malicious code (better known as XSS) with a thoroughly audited, secure yet permissive whitelist, it will also make sure your documents are standards compliant, something only achievable with a comprehensive knowledge of W3C's specifications.
Tired of using BBCode due to the current landscape of deficient or insecure HTML filters? Have a WYSIWYG editor but never been able to use it? Looking for high-quality, standards-compliant, open-source components for that application you're building? HTML Purifier is for you!
I wrote an article about how to use the HTML Purifier library with CodeIgniter.
Maybe it will help you give it another try:
// Load the default config and override settings as necessary.
// HTML Purifier 4.x uses dotted directive names; the older
// three-argument form set('HTML', 'Doctype', ...) is deprecated.
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
$config->set('HTML.AllowedElements', 'a,em,blockquote,p,strong,pre,code');
$config->set('HTML.AllowedAttributes', 'a.href,a.title');
$config->set('HTML.TidyLevel', 'light');
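A configuration like this only takes effect once it is passed to a purifier instance. The following is a minimal sketch of that step, assuming HTML Purifier 4.x (which uses dotted directive names). The include path and the sample input are assumptions for illustration, so adjust them to your installation.

```php
<?php
// Assumption: location of the library's autoloader in your project.
require_once 'htmlpurifier/library/HTMLPurifier.auto.php';

// Whitelist a small set of elements and attributes; everything else is dropped.
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.AllowedElements', 'a,em,blockquote,p,strong,pre,code');
$config->set('HTML.AllowedAttributes', 'a.href,a.title');

$purifier = new HTMLPurifier($config);

// Hypothetical untrusted input containing a script tag and an event handler.
$dirty = '<p onclick="steal()">Hello <script>alert(1)</script>'
       . '<a href="http://example.com" title="ok">link</a></p>';

// purify() returns the filtered HTML; store or output this instead of $dirty.
$clean = $purifier->purify($dirty);

echo $clean;
```

Creating the purifier once and reusing it for every submission avoids rebuilding the configuration on each call.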