Would the following 2 canonical link tags be viewed by spiders as pointing to the same URL?
<link rel="canonical" href="http://www.example.com/&#375;" /> - encoded
<link rel="canonical" href="http://www.example.com/ŷ" /> - unencoded
Building a valid URL: any code that generates or accepts UTF-8 input might treat URLs with UTF-8 characters as "valid", but it would also need to translate those characters before sending them out to a web server. This process is called URL-encoding or percent-encoding.
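As a minimal sketch of percent-encoding, assuming Python's standard `urllib.parse` module and UTF-8 as the encoding, the character from the question could be translated like this (the `path` variable is only illustrative):

```python
from urllib.parse import quote, unquote

# Illustrative path segment containing U+0177 (ŷ); not taken from a real site.
path = "/ŷ"

# quote() encodes the string as UTF-8 by default and percent-encodes every
# byte outside its safe set ("/" is left untouched by default).
encoded = quote(path)
print(encoded)

# unquote() reverses the process, decoding the bytes as UTF-8 by default.
decoded = unquote(encoded)
print(decoded == path)
```

Since ŷ is stored in UTF-8 as the two bytes 0xC5 0xB7, the encoded path comes out as `/%C5%B7`.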
All pages (including the canonical page) should contain a canonical tag to prevent any possible duplication. Even if there are no other versions of a page, that page should still include a canonical tag that links to itself.
The canonical tag is a page-level link element that is placed in the HTML head of a webpage. It tells the search engines which URL is the canonical version of the page being displayed.
Adding a link rel="canonical" element also helps to confirm the preferred version and encourages search engines to focus on it. So, in short, upper or lower case does matter for URLs.
&#375; is an HTML entity that represents the Unicode character with code point 375 in decimal notation. In hexadecimal it'd be 0x177, so we are talking about U+0177, which is ŷ.
That means that both URLs are exactly the same if:
If the browser displays ŷ in both cases, it's likely that the character set is correct, but you should make sure it is.
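One way to check the equivalence outside a browser is to parse both tags and compare the resulting `href` values. The sketch below assumes Python's standard `html.parser` module, with the first tag written using the numeric reference `&#375;`. The `CanonicalCollector` class is a hypothetical helper, and the result only shows how this one parser reads the markup, not how any particular search engine spider behaves.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Hypothetical helper: collects the href of each <link rel="canonical">."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already resolved character references such as
        # &#375; inside attribute values by the time this method is called.
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "canonical":
            self.hrefs.append(attributes.get("href"))


# The two tags from the question: numeric reference first, literal character second.
markup = (
    '<link rel="canonical" href="http://www.example.com/&#375;" />'
    '<link rel="canonical" href="http://www.example.com/ŷ" />'
)

collector = CanonicalCollector()
collector.feed(markup)
collector.close()

encoded_href, unencoded_href = collector.hrefs
print(encoded_href == unencoded_href)
```

Because the markup here is already a decoded Python string, the comparison sidesteps the character set question. For a real page, the bytes still have to be decoded with the charset the document declares before the literal ŷ can be trusted.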