I have a JavaScript request going to an ASP.NET (2.0) HTTP handler, which passes the request on to a Java web service. In this system, special characters, such as those with an accent, do not get passed on correctly.
E.g.
Düsseldorf
http://site/serviceproxy.ashx?q=D%FCsseldorf, which is valid in ISO-8859-1 as well as in UTF-8 as far as I can tell (unless it's %c3%bc in UTF-8).
HttpContext.Current.Request.QueryString.Get("q") returns D�sseldorf, which is where the trouble begins.
HttpUtility.UrlEncode(HttpContext.Current.Request.QueryString.Get("q"), Encoding.GetEncoding("ISO-8859-1")) returns D%3fsseldorf (a '?').
HttpUtility.UrlEncode(HttpContext.Current.Request.QueryString.Get("q"), Encoding.UTF8) returns D%ef%bfsseldorf.
So the value is neither decoded nor re-encoded correctly before being passed on to the Java service.
HttpContext.Current.Request.Url.Query is ?q=D%FCsseldorf&output=json&from=1&to=10
HttpContext.Current.Request.QueryString.ToString() is q=D%ufffdsseldorf&output=json&from=1&to=10
Why is this, and how can I tell the HttpContext to honor the request headers which include:
Content-Type=application/x-www-form-urlencoded;+charset=UTF-8
and decode the URL's QueryString using the UTF-8 charset.
Addendum: As the answer notes, the trouble lies not so much in the decoding as the encoding; using escape() in JavaScript does not escape according to UTF-8, while using encodeURIComponent() does.
I don't know what the default character encoding used by your server (IIS?) is, or if it can be changed, but I can tell you a few things that might help.
0xFC is the ISO-8859-1 encoding for ü. While the Unicode code point is U+00FC, when encoded with UTF-8, this requires two bytes, and becomes 0xC3 0xBC.
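To make the difference concrete, here is a small sketch that prints the code point and the byte sequences for ü. The helper name `hex` and the variable names are mine, not anything from your code, and it assumes an environment where `TextEncoder` is available (modern browsers, recent Node.js).

```javascript
// Hypothetical helper: format a number as two-digit uppercase hex.
const hex = (n) => n.toString(16).toUpperCase().padStart(2, "0");

const ch = "\u00FC"; // the character "ü"

// The Unicode code point, independent of any byte encoding.
const codePoint = "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");

// ISO-8859-1 maps U+0000..U+00FF to a single byte with the same value.
const latin1Bytes = [hex(ch.charCodeAt(0))];

// TextEncoder always produces UTF-8.
const utf8Bytes = Array.from(new TextEncoder().encode(ch), hex);

console.log(codePoint);   // U+00FC
console.log(latin1Bytes); // [ 'FC' ]
console.log(utf8Bytes);   // [ 'C3', 'BC' ]
```

So %FC in a URL is the ISO-8859-1 form, and %C3%BC is the UTF-8 form of the same character.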
If a UTF-8 decoder were to see the illegal byte sequence 0xFC, it would decode it as a Unicode "replacement character", U+FFFD, and pick up where it saw the beginning of another valid byte sequence, in this case 's'.
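You can reproduce that behavior outside of ASP.NET. The sketch below uses JavaScript's `TextDecoder` as a stand-in for the server's UTF-8 decoder; it is only an illustration of what a UTF-8 decoder does with a lone 0xFC, not a claim about how ASP.NET implements it. The variable names are made up for the example.

```javascript
// The bytes a client sends for "D%FCss": "D", a lone 0xFC, then "s", "s".
const rawBytes = new Uint8Array([0x44, 0xFC, 0x73, 0x73]);

// A lenient UTF-8 decoder replaces the invalid byte with U+FFFD
// and resumes at the next valid sequence.
const decoded = new TextDecoder("utf-8").decode(rawBytes);

console.log(decoded === "D\uFFFDss"); // true

// Re-encoding the already damaged string as UTF-8 percent-escapes
// yields the three bytes of U+FFFD, not the original character.
const reencoded = encodeURIComponent(decoded);

console.log(reencoded); // D%EF%BF%BDss
```

Once the replacement character is in the string, the original byte is gone, which is why re-encoding on the server cannot recover the ü.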
The reason you get %3f is that U+FFFD has no representation in ISO-8859-1, so the encoder substitutes '?' (0x3F). '?' is effectively the "replacement character" for the Latin character set, similar to � in the Unicode character set.
I believe what you're seeing is the client encoding with ISO-8859-1, but the server is decoding with UTF-8. As soon as it hits the server, your data is corrupted. I recommend that you modify the client to use UTF-8 encoding; it should be requesting http://site/serviceproxy.ashx?q=D%C3%BCsseldorf 
It sounds like you are constructing these URLs from JavaScript, so you should use the encodeURI and encodeURIComponent functions, not escape.
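As a rough sketch of the client-side change, assuming the query value is held in a plain string (the variable names are hypothetical; the URL and the `output` parameter are taken from your question):

```javascript
const city = "D\u00FCsseldorf"; // "Düsseldorf"

// escape() emits the code unit value for characters below U+0100,
// which happens to match ISO-8859-1 but is not UTF-8.
const legacy = escape(city);

// encodeURIComponent() percent-encodes the UTF-8 bytes.
const utf8 = encodeURIComponent(city);

console.log(legacy); // D%FCsseldorf
console.log(utf8);   // D%C3%BCsseldorf

// Encode each parameter value individually when building the query string.
const url = "http://site/serviceproxy.ashx?q=" + encodeURIComponent(city) + "&output=json";

console.log(url); // http://site/serviceproxy.ashx?q=D%C3%BCsseldorf&output=json
```

Use `encodeURIComponent` for individual parameter values, since it also escapes characters such as `&` and `=` that `encodeURI` leaves alone.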