How to encode the href attribute in HTML

Which encoding should be applied to the contents of an href attribute: HTML encoding or URL encoding?

<a href="???">link text</a>

On the one hand, since the href attribute contains a URL, I should use URL encoding. On the other hand, I'm inserting this URL into HTML, so it must be HTML encoded.

Please help me to overcome this contradiction.

Thanks.


EDIT:

Here's the contradiction. Suppose the URL contains '<' and '>' characters. URL encoding won't escape them, so reserved HTML characters will end up inside the href attribute, which violates the standard. HTML encoding will escape the '<' and '>' characters and the HTML will be valid, but then there will be unexpected '&' characters in the URL ('&' is a reserved URL character, used as the delimiter between query string parameters).

The reserved URL characters form a superset of the reserved HTML characters, except for '<' and '>', which are reserved in HTML but not in URLs.


EDIT 2:

I was wrong about the '<' and '>' characters: URL encoding does percent-escape them. If so, URL encoding is sufficient in this case, isn't it?
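
The claim in EDIT 2 can be checked with a minimal PHP sketch. The value 'a<b>&c', the parameter names x and y, and the example.com URL are made up for illustration. It suggests that percent-escaping covers the data inside a parameter, while the '&' that separates parameters stays a literal character in the URL and is still something HTML encoding has to handle:

```php
<?php
// Illustrative value only; it does not come from the question.
$value = 'a<b>&c';

// URL encoding percent-escapes '<', '>' and '&' inside a data value.
$encoded = rawurlencode($value);
echo $encoded, "\n"; // a%3Cb%3E%26c

// The '&' between parameters is URL syntax, not data, so it stays literal.
$url = 'http://example.com/?x=' . $encoded . '&y=1';

// Inside an HTML attribute, that separator is written as '&amp;'.
echo htmlspecialchars($url, ENT_QUOTES, 'UTF-8'), "\n";
// http://example.com/?x=a%3Cb%3E%26c&amp;y=1
```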

Asked by Maksim Tyutmanov on Apr 17 '12 at 10:04


1 Answer

Construct a URL as normal. Follow the rules for constructing URLs. Encode data you put into it.

Then construct HTML as normal. Follow the rules for constructing HTML. Encode data as you put it into it.

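That is, do both, in the right order: URL-encode the data while building the URL, then HTML-encode the finished URL while building the HTML.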

They aren't mutually exclusive, so there is no contradiction.

For example (this is a simplified example that assumes the data in $_GET exists and is correct; don't do that in the real world):

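$search_term = $_GET['q'];
$page = (int) $_GET['page'];
$next_page = $page + 1;
// Step 1: build the URL, URL-encoding each piece of data put into it.
$next_page_url = 'http://example.com/search?q=' . urlencode($search_term) . '&page=' . urlencode((string) $next_page);
// Step 2: build the HTML, HTML-encoding the finished URL (its '&' becomes '&amp;').
$html = '<a href="' . htmlspecialchars($next_page_url, ENT_QUOTES, 'UTF-8') . '">link text</a>';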
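
To see why the two encodings do not interfere with each other, here is a rough round-trip sketch. The input values are hypothetical stand-ins for $_GET, and htmlspecialchars_decode() is used only as an approximation of what an HTML parser does with an attribute value. Each layer is undone by its own decoder, in the reverse order:

```php
<?php
// Hypothetical inputs standing in for $_GET['q'] and the next page number.
$search_term = 'fish & chips <fresh>';
$next_page = 3;

// The same two steps as above: URL-encode the data, then HTML-encode the URL.
$url = 'http://example.com/search?q=' . urlencode($search_term)
     . '&page=' . urlencode((string) $next_page);
$href = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

// Roughly what the HTML parser does first: decode character references.
$url_seen_by_browser = htmlspecialchars_decode($href, ENT_QUOTES);

// What the receiving side does next: split the query on '&' and percent-decode.
parse_str(parse_url($url_seen_by_browser, PHP_URL_QUERY), $params);

var_dump($url_seen_by_browser === $url); // bool(true)
var_dump($params['q'] === $search_term); // bool(true)
var_dump($params['page']);               // string(1) "3"
```

The '&amp;' only exists in the HTML source. By the time the URL is requested it is a plain '&' again, so the query string delimiter the question worried about is intact.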

Answered by Quentin on Oct 07 '22 at 22:10