Convert a literal improperly encoded string (e.g., "Ã±") to ISO-8859-1 (Latin1) H

Tags: c#, .net, encoding

Without going into too much detail, I have a C# WCF application that is a wrapper for an XML-based API I am calling. That API returns a string, which is really just an XML document. I then parse that XML and return the parsed data, which is displayed in the browser as JSON.

It is a bit confusing, so here is some sample code:

[OperationContract]
[WebInvoke(Method = "GET", BodyStyle = WebMessageBodyStyle.Bare,
    ResponseFormat = WebMessageFormat.Json, UriTemplate = "/TestGetUser")]
TestGetUserResponse TestGetUser();

/* ... */

[DataContract(Namespace = "http://schema.mytestdomain/", Name = "TestGetUser")]
public class TestGetUserResponse
{
    [DataMember]
    public User User { get; set; }
    [DataMember]
    public Error Error { get; set; }
}

And TestGetUser being:

public TestGetUserResponse TestGetUser() {
    WebClient client = getCredentials(); // getCredentials() method is defined elsewhere

    string apiUrl = "http://my.api.url.com/API";
    string apiRequest = "<?xml version='1.0' encoding='utf-8' ?><test>My XML Request Lives Here</test>";
    
    string result = client.UploadString(apiUrl, apiRequest);
    
    XmlDocument user = new XmlDocument();
    user.LoadXml(result);
    
    XmlNode userNode = user.SelectSingleNode("/my[1]/xpath[1]/user[1]");
    
    return new TestGetUserResponse {
        Error = new Error(),
        User = new User {
            Name = userNode.SelectSingleNode("name[1]").InnerText,
            Email = userNode.SelectSingleNode("email[1]").InnerText,
            ID = System.Convert.ToInt32(userNode.SelectSingleNode("id[1]").InnerText)
        }
    };
}

So, when I hit my URL from a browser, it returns a JSON string, like below:

{
    "Error": {
        "ErrorCode": 0,
        "ErrorDetail": null,
        "ErrorMessage":"Success"
    },
    "User": {
        "Name": "John Smith",
        "Email": "[email protected]",
        "ID": 12345
    }
}

Now, my problem is, sometimes the string that is returned (directly from the API) is a badly encoded UTF-8 string (I think? I could be getting this a bit wrong). For example, I may get back:

{
    "Error": {
        "ErrorCode": 0,
        "ErrorDetail": null,
        "ErrorMessage":"Success"
    },
    "User": {
        "Name": "Jose NuÃ±ez",
        "Email": "[email protected]",
        "ID": 54321
    }
}

Notice the Ã± in the Name property under the User object.
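
As an illustration of how such a value can arise, here is a minimal sketch that takes the UTF-8 bytes of "ñ" (0xC3 0xB1) and decodes them with a single-byte encoding instead of UTF-8. The class and method names are hypothetical, and the assumption that the API response is being decoded this way is not confirmed by the question. ISO-8859-1 is used for the single-byte step; Windows-1252 gives the same result for these two bytes.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Hypothetical helper: encode text as UTF-8, then decode the bytes
    // with the wrong (single-byte) encoding, as a misconfigured client might.
    public static string MisDecode(string original)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(original);   // 'ñ' becomes 0xC3 0xB1
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        return latin1.GetString(utf8Bytes);                    // 0xC3 is 'Ã', 0xB1 is '±'
    }

    public static void Main()
    {
        // \u00F1 is 'ñ'; \u00C3\u00B1 is the two-character sequence "Ã±".
        string garbled = MisDecode("Jose Nu\u00F1ez");
        Console.WriteLine(garbled == "Jose Nu\u00C3\u00B1ez"); // True
    }
}
```

The garbled string is one character longer than the original, because the single two-byte character was read as two one-byte characters.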

My question is, how can I convert this improperly encoded string to a ñ, which is what it should be?

I've found a bunch of posts:

  • Strange Characters in database text: Ã, Ã, ¢, â‚ €,
  • How to convert these strange characters? (ë, Ã, ì, ù, Ã)
  • C# UTF8 Decoding, returning bytes/numbers instead of string
  • How to Decode "=?utf-8?B?...?=" to string in C#
  • How to convert (transliterate) a string from utf8 to ASCII (single byte) in c#?
  • MOST PROMISING C# Convert string from UTF-8 to ISO-8859-1 (Latin1) H

But none seems to be exactly what I need, and my attempts to borrow from those posts have failed.

So, to make my question as simple as possible,

If I have a variable in a C# (.NET 3.5) application that, when I write it out to the screen, gets written as 'Ã±', how can I "re-encode" it (that may be the wrong word) so that it outputs as 'ñ'?

Thanks in advance.

romellem asked Aug 04 '15 20:08


1 Answer

Ideally this would be fixed in the API you are calling, so that it returns the expected encoding. But you should be able to fix it this way:

byte[] bytes = Encoding.GetEncoding(1252).GetBytes(Name);
var nameFixed = Encoding.UTF8.GetString(bytes);
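
The same round trip can be sketched as a small self-contained program. The class name, method name, and sample value are assumptions for illustration, not part of the original answer. The sketch passes ISO-8859-1 as the "wrong" encoding because it is available on every .NET runtime, whereas code page 1252 needs `CodePagesEncodingProvider` registered on .NET Core and later. On .NET Framework, `Encoding.GetEncoding(1252)` can be passed in instead.

```csharp
using System;
using System.Text;

public static class EncodingRepair
{
    // Hypothetical helper: undo a wrong decode by turning each character back
    // into the byte it came from, then decoding those bytes as UTF-8.
    public static string FixMisdecodedUtf8(string garbled, Encoding wrongEncoding)
    {
        byte[] originalBytes = wrongEncoding.GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        // \u00C3\u00B1 is "Ã±", the mis-decoded form of 'ñ' (\u00F1).
        string garbled = "Jose Nu\u00C3\u00B1ez";
        string repaired = FixMisdecodedUtf8(garbled, latin1);

        Console.WriteLine(repaired == "Jose Nu\u00F1ez"); // True
    }
}
```

This should only be applied to values known to be mis-decoded. A string that already contains a correct 'ñ' would be re-encoded to the lone byte 0xF1, which is not valid UTF-8 in that position, and the character would come back as the replacement character U+FFFD.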

Kevin answered Oct 28 '22 23:10