I have a URL which requires some parameters. The values of those parameters can contain accented characters, so I absolutely need to UrlEncode them. Strangely, I see a difference between the behavior of JavaScript and .NET.
Let's say I try to UrlEncode the word "éléphant". In JavaScript (according to this website: http://www.albionresearch.com/misc/urlencode.php), I get the following: %E9l%E9phant. This appears correct to me. However, in .NET with this call (System.Web.HttpUtility.UrlEncode("éléphant")) I get "%c3%a9l%c3%a9phant". What is wrong? What am I missing? What should I do if I want to get %E9l%E9phant in .NET?
Thanks!
Why do we need to encode? URLs can only contain a limited subset of the standard 128-character ASCII set. Characters outside that subset (such as accented letters), as well as reserved characters such as ?, &, = and # when they are used as data rather than as delimiters, must be percent-encoded before they are placed in a URL.
The UrlEncode(String) method can be used to encode the entire URL, including query-string values. If characters such as blanks and punctuation are passed in an HTTP stream without encoding, they might be misinterpreted at the receiving end.
The encodeURIComponent() function encodes a URI component by replacing each instance of certain characters with one, two, three, or four escape sequences representing the UTF-8 encoding of the character (there will only be four escape sequences for characters composed of two "surrogate" characters).
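The "one, two, three, or four escape sequences" rule can be checked with a small sketch. The helper name escapeCount and the sample characters are illustrative choices, not part of any API; only encodeURIComponent itself is standard.

```javascript
// Count the %XX escape sequences that encodeURIComponent emits for a string.
// escapeCount is a hypothetical helper written for this illustration.
function escapeCount(text) {
  const matches = encodeURIComponent(text).match(/%[0-9A-F]{2}/g);
  return matches ? matches.length : 0;
}

// One sample per UTF-8 length: unescaped, 1 byte, 2 bytes, 3 bytes, 4 bytes.
const samples = ["a", " ", "\u00e9", "\u20ac", "\u{1F600}"];
for (const ch of samples) {
  console.log(JSON.stringify(ch), encodeURIComponent(ch), escapeCount(ch));
}
```

The letter "a" is left alone, a space becomes one sequence, "é" two, "€" three, and an emoji outside the Basic Multilingual Plane (stored as a surrogate pair) four.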
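The difference between encodeURI and encodeURIComponent is which characters they leave alone. encodeURIComponent escapes everything except letters, digits and - _ . ! ~ * ' ( ), whereas encodeURI additionally leaves unescaped the characters that have a structural meaning in a URL (; / ? : @ & = + $ , #). encodeURIComponent is therefore meant for a single component such as a query-string value, and encodeURI for a complete URL whose protocol prefix ('http://'), domain name and delimiters must stay intact.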
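A short sketch of that difference, using a made-up URL on example.com and a hypothetical helper named buildUrl (neither comes from the original posts):

```javascript
// A made-up URL containing a space, an accented word and URL delimiters.
const url = "http://example.com/search page?q=\u00e9l\u00e9phant&lang=fr";

// encodeURI keeps the delimiters (: / ? & =) so the URL structure survives.
const asWholeUrl = encodeURI(url);

// encodeURIComponent escapes the delimiters too, which is only appropriate
// for a single value that will be embedded inside a URL.
const asComponent = encodeURIComponent(url);

// Typical use: encode each name and value separately, then assemble the URL.
// buildUrl is a hypothetical helper for this illustration.
function buildUrl(base, name, value) {
  return base + "?" + encodeURIComponent(name) + "=" + encodeURIComponent(value);
}

console.log(asWholeUrl);
console.log(asComponent);
console.log(buildUrl("http://example.com/search", "q", "\u00e9l\u00e9phant & co"));
```

Encoding a value with encodeURI instead would leave the & in "éléphant & co" untouched, and the receiving end would read it as the start of a second parameter.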
System.Web.HttpUtility.UrlEncode uses UTF-8 as its default encoding. You can change this by passing an Encoding as the second argument:
System.Web.HttpUtility.UrlEncode("éléphant", Encoding.Default); // %e9l%e9phant
It is preferable, though, to specify an actual code page explicitly, for example Encoding.GetEncoding("iso-8859-1"), instead of relying on the OS default, which differs between machines.
"In JavaScript (according to this WebSite: http://www.albionresearch.com/misc/urlencode.php), I get the following: %E9l%E9phant."
That page is wrong. JavaScript's encodeURIComponent also uses UTF-8, just as .NET does by default. Try it yourself:
javascript:alert(encodeURIComponent('éléphant'))
%C3%A9l%C3%A9phant
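To see where the two results come from, here is a rough sketch that percent-encodes the same word from its UTF-8 bytes and from its ISO-8859-1 (Latin-1) bytes. The function names percentEncodeBytes, utf8Bytes and latin1Bytes are invented for this illustration, and the sketch assumes the accented letters are precomposed characters (U+00E9); TextEncoder is the standard API available in browsers and Node.js.

```javascript
// Percent-encode a list of byte values; unreserved ASCII characters pass through.
function percentEncodeBytes(bytes) {
  let out = "";
  for (const b of bytes) {
    const ch = String.fromCharCode(b);
    out += /[A-Za-z0-9\-_.~]/.test(ch)
      ? ch
      : "%" + b.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

// UTF-8 bytes, as used by encodeURIComponent and by .NET's default UrlEncode.
function utf8Bytes(text) {
  return Array.from(new TextEncoder().encode(text));
}

// ISO-8859-1 bytes: each code point up to U+00FF maps to the same byte value.
function latin1Bytes(text) {
  return Array.from(text, (c) => {
    const cp = c.codePointAt(0);
    if (cp > 0xff) throw new RangeError("not representable in ISO-8859-1: " + c);
    return cp;
  });
}

const word = "\u00e9l\u00e9phant"; // "éléphant"
console.log(percentEncodeBytes(utf8Bytes(word)));   // %C3%A9l%C3%A9phant
console.log(percentEncodeBytes(latin1Bytes(word))); // %E9l%E9phant
```

Both outputs are valid percent-encodings of bytes; they only disagree on which character encoding produced the bytes, so the receiving end has to decode with the same one.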
URLs today are UTF-8. Don't try to use cp1252 any more. UTF-8 is your friend. Trust UTF-8!