If the following statements are true:
The document is served with the header Content-Type: text/html; charset=UTF-8.
There are no <script> tags in the document.
Are there any cases where htmlspecialchars($input, ENT_QUOTES, 'UTF-8') (converting &, ", ', <, > to the corresponding named HTML entities) is not enough to protect against cross-site scripting when generating HTML on a web server?
Using htmlspecialchars() function – The htmlspecialchars() function converts special characters to HTML entities. For a majority of web-apps, we can use this method and this is one of the most popular methods to prevent XSS. This process is also known as HTML Escaping.
The htmlspecialchars() function converts special characters to HTML entities: & (ampersand) becomes &amp;, " (double quote) becomes &quot;, ' (single quote) becomes &#039; (when ENT_QUOTES is set), < (less than) becomes &lt;, and > (greater than) becomes &gt;.
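As a quick illustration of the conversion described above, here is a minimal sketch. The sample string and variable names are made up for the example.

```php
<?php
// Illustrative input only; not taken from any real application.
$input = "<a href='x'>Tom & \"Jerry\"</a>";

// ENT_QUOTES converts both double and single quotes.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// &lt;a href=&#039;x&#039;&gt;Tom &amp; &quot;Jerry&quot;&lt;/a&gt;
```

Note that with these flags the single quote comes out as the numeric reference &#039; rather than a named entity.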
In answer to your question, you should use htmlentities() when outputting any content that could contain user input or special characters.
htmlspecialchars() is more than enough. htmlentities() is for a different use, not preventing XSS.
Cross site scripting, or XSS, is a form of attack on a web application which involves executing code on a user's browser. Output encoding is a defense against XSS attacks.
htmlspecialchars() is enough to prevent document-creation-time HTML injection with the limitations you state (i.e. no injection into tag content/unquoted attribute).
However, there are other kinds of injection that can lead to XSS, and the condition "There are no <script> tags in the document" doesn't cover all cases of JS injection. You might for example have an event handler attribute (requires JS-escaping inside HTML-escaping):
<div onmouseover="alert('<?php echo htmlspecialchars($xss) ?>')"> // bad!
or, even worse, a javascript: link (requires JS-escaping inside URL-escaping inside HTML-escaping):
<a href="javascript:alert('<?php echo htmlspecialchars($xss) ?>')"> // bad!
It is usually best to avoid these constructs anyway, but especially when templating. Writing <?php echo htmlspecialchars(urlencode(json_encode($something))) ?> is quite tedious.
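To make the event-handler case concrete, here is a minimal sketch of why HTML-escaping alone fails there and what "JS-escaping inside HTML-escaping" can look like. The helper name js_string_for_attr() and the payload are hypothetical, and html_entity_decode() is only used to approximate how a browser decodes an attribute value before handing it to the JavaScript engine. JSON_THROW_ON_ERROR assumes PHP 7.3 or later.

```php
<?php
// Hypothetical helper: JS-escape first (as a JSON string literal),
// then HTML-escape the result for use inside a quoted attribute.
function js_string_for_attr(string $value): string
{
    $flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
           | JSON_THROW_ON_ERROR;
    return htmlspecialchars(json_encode($value, $flags), ENT_QUOTES, 'UTF-8');
}

// Made-up attacker input.
$xss = "');alert(document.cookie);//";

// HTML-escaping only: the browser decodes &#039; back to ' before running the JS.
$unsafeAttr = "alert('" . htmlspecialchars($xss, ENT_QUOTES, 'UTF-8') . "')";
$seenByJs = html_entity_decode($unsafeAttr, ENT_QUOTES, 'UTF-8');
echo $seenByJs, PHP_EOL;
// alert('');alert(document.cookie);//')

// JS-escaping inside HTML-escaping: the payload stays inside one string literal.
$safeAttr = 'alert(' . js_string_for_attr($xss) . ')';
$safeSeenByJs = html_entity_decode($safeAttr, ENT_QUOTES, 'UTF-8');
echo $safeSeenByJs, PHP_EOL;
// alert("\u0027);alert(document.cookie);\/\/")
```

Even with such a helper, keeping user data out of inline handlers altogether is usually the simpler option.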
And... injection issues can happen on the client side as well (DOM XSS); htmlspecialchars() won't protect you against a piece of JavaScript writing to innerHTML (commonly .html() in poor jQuery scripts) without explicit escaping.
And... XSS has a wider range of causes than just injections. Other common causes are:
- allowing the user to create links, without checking for known-good URL schemes (javascript: is the most well-known harmful scheme but there are more)
- deliberately allowing the user to create markup, either directly or through light-markup schemes (like bbcode, which is invariably exploitable)
- allowing the user to upload files (which can through various means be reinterpreted as HTML or XML)
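For the first cause in that list, a known-good scheme check might look like the following minimal sketch. The function name is_allowed_link() is hypothetical, and the policy (absolute http/https URLs only, no whitespace or control characters) is a deliberately strict assumption rather than a complete URL validator.

```php
<?php
// Hypothetical helper: accept only absolute http(s) URLs.
// Anything else (javascript:, data:, relative paths, URLs containing
// whitespace or control characters) is rejected rather than repaired.
function is_allowed_link(string $url): bool
{
    return preg_match('#\Ahttps?://[^\s\x00-\x1f]+\z#i', $url) === 1;
}

var_dump(is_allowed_link('https://example.com/page'));  // bool(true)
var_dump(is_allowed_link('javascript:alert(1)'));       // bool(false)
var_dump(is_allowed_link(" javascript:alert(1)"));      // bool(false)
var_dump(is_allowed_link("java\tscript:alert(1)"));     // bool(false)
```

An accepted URL still needs htmlspecialchars() when it is written into an href attribute; the check only decides whether the link may be created at all.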
Assuming you are not using older PHP versions (5.2 or so), htmlspecialchars() is "safe" (and of course taking the backend code into consideration, as @Royal Bg mentions).
In older PHP versions, malformed UTF-8 characters made this function vulnerable.
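For comparison, this is a minimal sketch of how current PHP versions treat malformed UTF-8 when the flags are passed explicitly. The input string is made up; "\xFF" is a byte that can never occur in valid UTF-8.

```php
<?php
// Made-up input: one invalid byte followed by ordinary markup.
$malformed = "\xFF<b>";

// Without ENT_SUBSTITUTE or ENT_IGNORE, an invalid sequence makes
// htmlspecialchars() return an empty string.
$strict = htmlspecialchars($malformed, ENT_QUOTES, 'UTF-8');
var_dump($strict === '');                        // bool(true)

// With ENT_SUBSTITUTE, the invalid byte becomes U+FFFD and the
// rest of the string is escaped as usual.
$substituted = htmlspecialchars($malformed, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
var_dump($substituted === "\u{FFFD}&lt;b&gt;");  // bool(true)
```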
My 2 cents: always sanitize/check your inputs by specifying what is allowed, instead of just escaping/encoding everything.
For example, if someone must enter a telephone number, I can imagine the following characters are allowed: 0123456789()+-. and a space; all others are simply ignored/stripped out.
The same would apply to addresses etc.: someone specifying UTF-8 characters for dots/blocks/hearts in their address is almost certainly not entering a real address.
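The telephone-number allow-list described above could be sketched like this. The function name filter_phone() and the sample input are hypothetical; the character set is the one given in the answer.

```php
<?php
// Hypothetical helper: keep only digits, parentheses, plus, minus,
// dot and space; every other character is stripped.
function filter_phone(string $input): string
{
    return preg_replace('/[^0-9()+\-. ]/', '', $input) ?? '';
}

echo filter_phone('+31 (0)20-555.0100<script>'), PHP_EOL;
// +31 (0)20-555.0100
```

This kind of filtering narrows what can be stored, but the value should still be escaped with htmlspecialchars() when it is written into HTML.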