I've been using Apache's StringEscapeUtils for HTML entities, but is there a standard way to escape HTML attribute values? I guess that the escapeHtml function won't cut it, since otherwise why would the OWASP Encoder interface have two different methods to cope with this?
Does anyone know what is involved in escaping HTML attributes vs. entities, and what to do about attribute encoding when you don't have the OWASP library to hand?
It looks like this is Rule #2 of OWASP's XSS Prevention Cheat Sheet. Note the bit where it says:
Properly quoted attributes can only be escaped with the corresponding quote
Therefore, I guess that so long as the attributes are correctly bounded with double or single quotes and you escape these (i.e. double quote (") becomes &quot; and single quote (') becomes &#x27; (or &#39;)), you should be OK. Note that Apache's StringEscapeUtils.escapeHtml will be insufficient for this task since it does not escape the single quote ('); you should use String's replaceAll method to do this.
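For what it's worth, here is a minimal sketch of the quoted-attribute case using only the JDK. The class and method names (QuotedAttributeEscaper, escape) are made up for illustration and do not come from Apache Commons or OWASP. It escapes both quote characters plus &, < and > in a single pass, which avoids the ordering problems you can hit when chaining several replaceAll calls (for example, re-escaping the & of an entity you have just produced).

```java
/** Hypothetical helper, not part of any library. */
final class QuotedAttributeEscaper {

    private QuotedAttributeEscaper() {}

    /**
     * Escapes a value that will be placed inside a quoted HTML attribute,
     * e.g. <div title="HERE"> or <div title='HERE'>.
     */
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break; // &#39; works as well
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

You would then write something like "<div title=\"" + QuotedAttributeEscaper.escape(userInput) + "\">". This only helps if the attribute really is quoted in the template.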
Otherwise, if the attribute is written without quotes, e.g. <div attr=some_value>, then you need to follow the recommendation on that page and:
escape all characters with ASCII values less than 256 with the &#xHH; format (or a named entity if available) to prevent switching out of the attribute
I'm not sure if there is a non-OWASP standard implementation of this though. However, I guess it's good practice not to write attributes in this manner anyway!
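If you do have to cope with unquoted attributes, the rule quoted above could be sketched roughly as follows. This is my own illustration, not OWASP's implementation: the names (UnquotedAttributeEncoder, encode) are hypothetical, and I am assuming that ASCII letters and digits can be left alone, as I believe the cheat sheet allows. It also does nothing special for control characters, which a real encoder would probably drop or replace.

```java
/** Hypothetical helper, not part of any library. */
final class UnquotedAttributeEncoder {

    private UnquotedAttributeEncoder() {}

    /**
     * Encodes every character below 256 that is not an ASCII letter or digit
     * as &#xHH;. Characters at or above 256 are passed through unchanged.
     */
    static String encode(String value) {
        StringBuilder out = new StringBuilder(value.length() * 2);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alphanumeric || c >= 256) {
                out.append(c);
            } else {
                // Two-digit upper-case hex, e.g. a space becomes &#x20;
                out.append(String.format("&#x%02X;", (int) c));
            }
        }
        return out.toString();
    }
}
```

The point of encoding so aggressively is that an unquoted attribute value can be terminated by a space, a tab, = and several other characters, not just by a quote.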
Note that this is only valid when you are putting in standard attribute values; if the attribute is an href or a JavaScript event handler, then it's a different story. For examples of XSS attacks that can occur from unsafe code inside event handler attributes, see: http://ha.ckers.org/xss.html.