I am trying to get up to date on the available and appropriate countermeasures that actively reduce the chance of being hit by the XSS train during 2011.
I've googled like never before, only to find that there are plenty of libraries available online that are supposed to help with XSS issues and that proudly and boldly state that "the XSS/SQL injection buck stops here".
I have found that these libraries suffer from at least one of the two following symptoms:
PHP has been around for some time now, and the far-from-decent strip_tags is accompanied by functions such as filter_var, among others. I am far from an expert in these security issues and really can't tell whether they will ensure good nights of sleep in the future or not.
What is my best chance of reducing XSS injections during 2011 without bloating my code, with or without dated libraries?
The best solution is to use a template engine that will not let you output any data as plain HTML unless you explicitly tell it to. If you have to escape things manually, it's way too easy to forget something and leave yourself open to an XSS attack.
If you're stuck without a real template engine, use htmlspecialchars().
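As a rough illustration of the manual fallback, here is a minimal sketch of wrapping htmlspecialchars() in a short helper so that escaping at output is harder to forget. The helper name e() and the sample username are assumptions made up for this example; they are not part of PHP or of any library mentioned here.

```php
<?php
// Hypothetical helper: the name e() is an assumption for illustration only.
function e(string $value): string
{
    // ENT_QUOTES escapes both single and double quotes, so the result can
    // also be placed inside a quoted HTML attribute value.
    // ENT_SUBSTITUTE replaces invalid UTF-8 sequences instead of returning
    // an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Untrusted input is escaped at the moment it is written into HTML.
$username = "Mr. <script>alert('XSS');</script>";
echo '<p>Posted by ' . e($username) . '</p>', PHP_EOL;
```

The browser then renders the script tag as visible text instead of executing it. This only covers HTML text and quoted attribute values; data placed inside JavaScript, CSS, or URLs needs escaping appropriate to that context.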
I couldn't agree more with Matti Virkkunen, or with what I believe is implied by Matti's answer, so let me say it loud and clear: nothing will "remove all malicious code". You can never know how the data is going to be used in other parts of the application or in the future. You can "purify" it for SQL, but you shouldn't put anything unescaped in SQL in the first place. You can "purify" it for HTML, but you should never include any data unescaped in HTML. You can "purify" it for inclusion in a parameter to awk in a shell script, but ... you get the idea.
Even the halting problem is undecidable, much less the malicious intents of any given code or data. Any methods of input "purification" are useless in the long run. What is needed is the correct escaping of data. Always. So if you want to include anything in a SQL query, include it as data and not as code. If you want to print a username in a blog post, include it as text and not as HTML. No one will ever be harmed by seeing a comment from "Mr. <script>alert('XSS');</script>" if the username is HTML-encoded or dynamically added to the DOM as a text node.
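To make the "data, not code" point concrete, here is a small sketch that stores hostile-looking input through a PDO prepared statement and escapes it only when it is written into HTML. It assumes the pdo_sqlite extension is available, and the table name, column names, and sample values are invented for the example.

```php
<?php
// Assumption: pdo_sqlite is installed; an in-memory database keeps the
// example self-contained. Table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (author TEXT, body TEXT)');

$author = "Mr. <script>alert('XSS');</script>";
$body   = "Robert'); DROP TABLE comments;--";

// The values travel as bound parameters (data) and never become part of
// the SQL text (code), so no input "purification" is needed here.
$insert = $pdo->prepare(
    'INSERT INTO comments (author, body) VALUES (:author, :body)'
);
$insert->execute([':author' => $author, ':body' => $body]);

// The data is stored unchanged; escaping happens at the point of output,
// for the context it is being written into (here: HTML text).
$row  = $pdo->query('SELECT author, body FROM comments')->fetch(PDO::FETCH_ASSOC);
$html = '<p><b>' . htmlspecialchars($row['author'], ENT_QUOTES, 'UTF-8') . '</b>: '
      . htmlspecialchars($row['body'], ENT_QUOTES, 'UTF-8') . '</p>';
echo $html, PHP_EOL;
```

The stored values are byte-for-byte what the user typed, which keeps them usable for other outputs (JSON, e-mail, a shell argument) that each need their own escaping.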
All of the automatic purification tools are nothing more than magic dust to sprinkle on your program to make it secure. They say: "Here - we have made all of your data kosher so you can now use it insecurely and not bother about data boundaries!" This only leads to a false sense of security. We as developers need to take responsibility for the output of our data and never assume that everything is great because we got a tool to make all of our data "safe", whatever that means, at the input.
I recommend HTMLPurifier for user-submitted data:
HTML Purifier is a standards-compliant HTML filter library written in PHP. HTML Purifier will not only remove all malicious code (better known as XSS) with a thoroughly audited, secure yet permissive whitelist, it will also make sure your documents are standards compliant, something only achievable with a comprehensive knowledge of W3C's specifications.
Rule 3b of the essential security rules is pretty much all there is to it. Stick to converting user-input consistently before you output it, and you're safe.