I'm developing an international site which uses UTF8 to display non english characters. I'm also using friendly URLS which contain the item name. Obviously I can't use the non english characters in the URL.
Is there some sort of common practice for this conversion? I'm not sure which english characters i should be replacing them with. Some are quite obvious (like è to e) but other characters I am not familiar with (such as ß).
You can use UTF-8 encoded data in URL paths. You just need to encoded it additionally with the Percent encoding (see rawurlencode
):
// ß (U+00DF) = 0xC39F (UTF-8)
$str = "\xC3\x9F";
echo '<a href="http://en.wikipedia.org/wiki/'.rawurlencode($str).'">'.$str.'</a>';
This will echo a link to http://en.wikipedia.org/wiki/ß. Modern browsers will display the character ß
itself in the location bar instead of the percentage encoded representation of that character in UTF-8 (%C3%9F
).
If you don’t want to use UTF-8 but only ASCII characters, I suggest to use transliteration like Álvaro G. Vicario suggested.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With