My .NET ASMX webservice is accepting requests from a client I don't have direct control over. It's sending a request that looks like this:
POST /Service.asmx HTTP/1.1
Connection: Keep-Alive
Pragma: no-cache
Content-Length: 1382
Content-Type: text/xml
Accept: text/xml
Host: localhost
User-Agent: Borland SOAP 1.1
SOAPAction: "http://domain.com/InsertRecords"
<?xml version="1.0"?>
<SOAP-ENV:Envelope... <v>ÄLMÅ BÄCK</v></SOAP-ENV:Envelope>
In my WebMethod, the string ÄLMÅ BÄCK gets munged to ??LM?? B??CK -- typical encoding mess-up.
In my testing I've found that if I simply tweak the content-type header, all is well:
Content-Type: text/xml; charset=utf-8
Why is .NET choosing an encoding other than UTF-8 when none is specified, and is there any way I can coerce this ASMX service to use UTF-8?
Running the following code before the web service handler is invoked resulted in the HTTP request being decoded correctly:
if (HttpContext.Current.Request.ContentType == "text/xml") {
    HttpContext.Current.Request.ContentType = "text/xml; charset=UTF-8";
}
This feels a bit hacky, but I believe it'll work well for my circumstances. I'm still very interested in some background information on why this was an issue at all, and if there's a better way to pull this off (besides getting the client to be more explicit about the encoding).
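On the "why": the symptom is consistent with the UTF-8 bytes from the client being decoded as US-ASCII, which is the default that RFC 3023 gives for text/xml when no charset parameter is present. Whether ASMX applies exactly that rule internally is an assumption on my part, but the small sketch below reproduces the same munged string. CharsetDemo and RoundTrip are hypothetical names used only for illustration.

```csharp
using System;
using System.Text;

public static class CharsetDemo
{
    // Hypothetical helper: encode text as UTF-8 (as the client appears to do),
    // then decode those bytes with whatever encoding the server assumed.
    public static string RoundTrip(string text, Encoding assumedByServer)
    {
        byte[] wireBytes = Encoding.UTF8.GetBytes(text);
        return assumedByServer.GetString(wireBytes);
    }

    public static void Main()
    {
        // Escapes avoid depending on this source file's own encoding;
        // the value is the string from the question.
        string original = "\u00C4LM\u00C5 B\u00C4CK";

        // Each accented letter is two bytes in UTF-8, and the ASCII decoder
        // replaces every byte above 0x7F with '?'.
        Console.WriteLine(RoundTrip(original, Encoding.ASCII)); // prints ??LM?? B??CK

        // With the charset stated, the same bytes decode back to the original.
        Console.WriteLine(RoundTrip(original, Encoding.UTF8) == original); // prints True
    }
}
```

Two question marks per accented letter is the giveaway: one replacement character for each byte of the two-byte UTF-8 sequence.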
IIS 7.5 has configuration options to help with this. (I don't know if earlier versions support this.) I had a similar issue where I have a web application that receives requests from a system that uses the upper 5 characters of the extended ASCII character set as meaningful delimiters. These were getting munged by the decoding that IIS was applying to incoming requests. I found some IIS config options to fix this issue.
First, in IIS Manager, select your website and open the .NET Globalization settings.
There are settings for the expected encodings for File, Requests, Response Headers, and Responses, with dozens of encoding options to choose from.
This worked perfectly for my scenario because this particular site only received requests from a program that I wrote, so I controlled both ends. With an unknown audience, you're just hoping your users encode their requests appropriately. (But then, that's true even if you use the default encodings...)
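For reference, the .NET Globalization page in IIS Manager edits the globalization element under system.web in the site's web.config, so the same settings can be written by hand. A minimal sketch, assuming UTF-8 is wanted for all four settings (pick whatever encoding your client actually sends):

```xml
<configuration>
  <system.web>
    <!-- Example values only; each attribute maps to one setting in the IIS Manager page -->
    <globalization
        fileEncoding="utf-8"
        requestEncoding="utf-8"
        responseHeaderEncoding="utf-8"
        responseEncoding="utf-8" />
  </system.web>
</configuration>
```

I have not verified that this changes how an ASMX handler decodes a SOAP body that arrives without a charset, so treat it as the fix for my scenario rather than a confirmed fix for the original question.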