When encoding possibly unsafe data, is there a reason to encode > as &gt; (for example in attr="data", attr='data', or <tag>data</tag>)?
I think the reasons somebody would do this are:
Regex-based tag matching such as <[^>]+>? (rare)
Unquoted attribute values such as attr=data. :-o (not happening!)
Am I missing anything?
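To make the regex case concrete, here is a minimal Python sketch. The sample markup and the names TAG, raw and encoded are made up for illustration; only the pattern <[^>]+> comes from the question.

```python
import re

# Naive tag matcher from the question: "<", then everything up to the first ">".
TAG = re.compile(r"<[^>]+>")

# Hypothetical sample markup: the same link with and without ">" encoded
# inside the quoted attribute value.
raw = '<a title="x > y">link</a>'
encoded = '<a title="x &gt; y">link</a>'

stripped_raw = TAG.sub("", raw)          # match ends at the ">" inside the attribute
stripped_encoded = TAG.sub("", encoded)  # the whole opening tag is matched

print(repr(stripped_raw))      # ' y">link'
print(repr(stripped_encoded))  # 'link'
```

With the literal > the naive pattern cuts the tag short and leaves attribute debris behind; with &gt; it removes the whole tag.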
&gt; stands for the greater-than sign: >
&le; stands for the less-than or equals sign: ≤
&ge; stands for the greater-than or equals sign: ≥
Any time you are trying to output data that could include untrusted HTML, you should use HTMLENCODE. It encodes text and merge field values for use in HTML by replacing characters that are reserved in HTML, such as the greater-than sign (>), with HTML entity equivalents, such as &gt;.
HTML encoding turns the < character into "&lt;", which is the encoded representation of the less-than sign. URL encoding does the same job for URLs, where the set of special characters is different, although there is some overlap.
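As a rough illustration of the difference, the following sketch runs one made-up sample string through Python's standard-library html.escape and urllib.parse.quote. The sample string and variable names are assumptions chosen for the example, not anything from the answers here.

```python
import html
from urllib.parse import quote

# Hypothetical sample input containing characters special to HTML and to URLs.
text = '<b>"Tom & Jerry"</b>'

# HTML encoding: reserved characters become entity references.
html_encoded = html.escape(text)

# URL encoding: reserved characters become %XX escapes (safe="" also encodes "/").
url_encoded = quote(text, safe="")

print(html_encoded)  # &lt;b&gt;&quot;Tom &amp; Jerry&quot;&lt;/b&gt;
print(url_encoded)   # %3Cb%3E%22Tom%20%26%20Jerry%22%3C%2Fb%3E
```

Note that html.escape encodes > as well as <, &, and (by default) both quote characters.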
Strictly speaking, to prevent HTML injection, you need only encode < as &lt;.
If user input is going to be put in an attribute, also encode " as &quot;.
If you're doing things right and using properly quoted attributes, you don't need to worry about >. However, if you're not certain of this, you should encode it just for peace of mind; it won't do any harm.
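The rules above could be sketched in Python roughly as follows. The function names are hypothetical, and encoding & first is an added assumption (it keeps existing entity-like text in the input from being reinterpreted), not something this answer requires.

```python
def escape_text(s: str) -> str:
    """Minimal escaping for text placed between tags."""
    # Assumption: "&" is encoded first so input such as "&lt;" stays literal.
    return s.replace("&", "&amp;").replace("<", "&lt;")


def escape_attr(s: str) -> str:
    """Escaping for a value placed inside a double-quoted attribute."""
    return escape_text(s).replace('"', "&quot;")


def escape_cautious(s: str) -> str:
    """Same as escape_attr, but also encodes ">" for peace of mind."""
    return escape_attr(s).replace(">", "&gt;")


# Hypothetical hostile input that tries to break out of an attribute.
payload = '"><script>alert(1)</script>'

print(escape_attr(payload))      # &quot;>&lt;script>alert(1)&lt;/script>
print(escape_cautious(payload))  # &quot;&gt;&lt;script&gt;alert(1)&lt;/script&gt;
```

In the first output the literal > characters remain, but with no unescaped " or < left the value cannot close the attribute or open a tag.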
The HTML4 specification, in its section 5.3.2, says that
authors should use "&gt;" (ASCII decimal 62) in text instead of ">"
so I believe you should encode the greater-than sign > as &gt; (because you should obey the standards).