URL + htmlentities? What to think about this?

I asked a question on another forum about urlencode(), and someone replied brusquely that it is absolutely required to apply htmlentities() on top of it, and that I should do so every time I write a URL, "to be valid and secure", he said. I do not see why it would be a security issue. Here is the code he mentioned:

echo '<a href="index.php?' . htmlentities('page=encode&code='.urlencode($code).'&login='.urlencode($login).'&codeconf=' . urlencode($codeconf)) . '">';

The PHP manual does indeed mention htmlentities(), but gives no further explanation:

Note ... PHP supports changing the argument separator to the W3C-suggested semi-colon through the arg_separator .ini directive. Unfortunately most user agents do not send form data in this semi-colon separated format. A more portable way around this is to use &amp; instead of & as the separator. You don't need to change PHP's arg_separator for this. Leave it as &, but simply encode your URLs using htmlentities() or htmlspecialchars().

I replaced "&" with "&amp;", validated my page with the W3C validator, and it came out OK.

I am still concerned about this htmlentities issue.

  1. Is there any good reason why we should use htmlentities in URLs?
  2. If yes, does this apply to all types of URLs?
  3. If yes, is it for security reasons?
SunnyOne asked Oct 16 '12 05:10


1 Answer

This is not about HTML entities in URLs. This is about you putting arbitrary data into HTML, which means you need to HTML escape any special characters in it. That this data happens to be a URL is irrelevant.

  1. You need to escape any arbitrary data you put into the URL with urlencode, so that characters with a special meaning in a URL (such as & or =) are kept as data instead of being read as URL syntax.
  2. The arbitrary blob of data you get from step one needs to be HTML escaped for the same reasons when put into HTML. As you see in your example, there's an & in your data which is required to be escaped to &amp; by HTML rules.
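
The two steps above can be sketched as follows. This is only a minimal illustration: the values assigned to $code, $login and $codeconf are made up for the example, and htmlspecialchars() is used instead of htmlentities() because escaping the HTML-special characters is enough for an attribute value.

```php
<?php
// Hypothetical values standing in for the variables from the question.
$code     = 'a&b=c';
$login    = 'sunny one';
$codeconf = '42';

// Step 1: URL-encode each value so it cannot change the query string structure.
$url = 'index.php?page=encode'
     . '&code='     . urlencode($code)
     . '&login='    . urlencode($login)
     . '&codeconf=' . urlencode($codeconf);

// Step 2: HTML-escape the finished URL because it goes into an HTML attribute.
$link = '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">link</a>';

echo $link, "\n";
```

Note that the & inside $code becomes %26 in step 1, while the & separators between parameters become &amp; in step 2. The two escapings solve different problems and do not replace each other.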

If you did not use the URL in an HTML context, there'd be no need to HTML escape it. HTML entities have no place in a URL. A URL in an HTML context must be HTML escaped though, like any other data.
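
To illustrate the difference between contexts, here is a small sketch that uses the same URL once in an HTTP header and once in HTML. The $login value is made up, and htmlspecialchars_decode() is used only to imitate what a browser does when it reads the attribute.

```php
<?php
// Hypothetical value for the example.
$login = 'Tom & Jerry';

$url = 'index.php?page=encode&login=' . urlencode($login);

// Non-HTML context (an HTTP header line): the URL is used as-is.
$header = 'Location: ' . $url;

// HTML context: the same URL is escaped for use in an attribute value.
$attr = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

// Imitate the browser: decoding the attribute gives back the original URL,
// so the entities never reach the server as part of the request.
$decoded = htmlspecialchars_decode($attr, ENT_QUOTES);

echo $header, "\n", $attr, "\n";
var_dump($decoded === $url);
```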

deceze answered Sep 28 '22 04:09