Decoding a special character in C#

Question

I am wondering how I could decode the special character â€¢ to HTML?

I have tried using System.Web.HttpUtility.HtmlDecode, but no luck yet.

drf · Accepted Answer

The issue here is not HTML decoding, but rather that the text was encoded in one character set (UTF-8) and then decoded using a different one (e.g., windows-1252).

In UTF-8, • is encoded as the bytes E2 80 A2. When this byte sequence is read using the windows-1252 encoding, E2 80 A2 decodes as â€¢. (Saved again as UTF-8, â€¢ becomes C3 A2 E2 82 AC C2 A2.)
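The byte sequences above can be reproduced with a short console program. This is only a sketch: the class name is made up for illustration, and the RegisterProvider call is an assumption about the runtime (it is needed on .NET Core and .NET 5+, where windows-1252 is not available by default, and may be unnecessary on .NET Framework).

```csharp
using System;
using System.Text;

class MojibakeDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where code page
        // encodings such as windows-1252 must be registered first.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Step 1: the bullet (U+2022) encoded as UTF-8.
        byte[] utf8 = Encoding.UTF8.GetBytes("\u2022");
        Console.WriteLine(BitConverter.ToString(utf8));    // E2-80-A2

        // Step 2: the same bytes misread as windows-1252.
        // The result is the three characters U+00E2, U+20AC, U+00A2.
        string misread = Encoding.GetEncoding("windows-1252").GetString(utf8);

        // Step 3: the misread text saved again as UTF-8.
        byte[] resaved = Encoding.UTF8.GetBytes(misread);
        Console.WriteLine(BitConverter.ToString(resaved)); // C3-A2-E2-82-AC-C2-A2
    }
}
```

The output is printed as hex rather than as text so that the console's own encoding does not distort the result.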

If the file is a windows-1252-encoded file, it can simply be read with the correct encoding (e.g., by passing the encoding as an argument to a StreamReader constructor):

new StreamReader(..., Encoding.GetEncoding("windows-1252"));
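As a fuller sketch of reading with an explicit encoding, the program below writes a small windows-1252 file to a temporary path and reads it back. The temporary file, its contents, and the class name are assumptions made for the example and are not from the question; the RegisterProvider call is only needed on .NET Core and .NET 5+.

```csharp
using System;
using System.IO;
using System.Text;

class ReadWithEncoding // hypothetical name, for illustration only
{
    static void Main()
    {
        // Assumption: .NET Core / .NET 5+, where windows-1252 must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding win1252 = Encoding.GetEncoding("windows-1252");

        // Hypothetical input: "• Test" as windows-1252 bytes (0x95 is the bullet).
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 0x95, 0x20, 0x54, 0x65, 0x73, 0x74 });

        try
        {
            // Passing the encoding explicitly avoids the UTF-8 default.
            using (var reader = new StreamReader(path, win1252))
            {
                string text = reader.ReadToEnd();
                Console.WriteLine(text == "\u2022 Test"); // True
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```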

If the file was saved with an incorrect encoding, the encoding can be reversed in some cases. For instance, for the string sequence in your question, you can write:

string s = "â€¢"; // the string sequence that is not properly encoded
var b = Encoding.GetEncoding("windows-1252").GetBytes(s); // b = `E2 80 A2`
string c = Encoding.UTF8.GetString(b);  // c = `•`
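Because the reversal only works in some cases, it can help to detect when it fails instead of silently producing replacement characters. The sketch below wraps the same two steps in a hypothetical TryRepair helper (not part of any library) that uses exception fallbacks on both encodings. The mojibake string is written with \u escapes so the example does not depend on how the source file itself is saved.

```csharp
using System;
using System.Text;

static class MojibakeCheck // hypothetical name, for illustration only
{
    // Returns true and the repaired text if 'garbled' round-trips cleanly
    // through windows-1252 bytes into valid UTF-8; otherwise returns false.
    public static bool TryRepair(string garbled, out string repaired)
    {
        // Strict variants: throw instead of substituting '?' or U+FFFD.
        Encoding strict1252 = Encoding.GetEncoding("windows-1252",
            EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);
        Encoding strictUtf8 = new UTF8Encoding(false, true);

        try
        {
            repaired = strictUtf8.GetString(strict1252.GetBytes(garbled));
            return true;
        }
        catch (ArgumentException)
        {
            // EncoderFallbackException and DecoderFallbackException both
            // derive from ArgumentException.
            repaired = garbled;
            return false;
        }
    }

    static void Main()
    {
        // Assumption: .NET Core / .NET 5+, where windows-1252 must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        string fixedText;
        // The sequence from the question: U+00E2, U+20AC, U+00A2.
        Console.WriteLine(TryRepair("\u00E2\u20AC\u00A2", out fixedText));
        // A bullet that was never garbled; its windows-1252 byte (0x95)
        // is not valid UTF-8 on its own, so the helper reports failure.
        Console.WriteLine(TryRepair("\u2022", out fixedText));
    }
}
```

Text that consists only of ASCII passes through unchanged, since ASCII bytes are identical in both encodings.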

Note that many common typographic characters, such as "smart quotes", bullets, and dashes, are in the range U+2000 to U+2044. The UTF-8 encodings of U+2000 through U+203F all begin with the bytes E2 80, which windows-1252 reads as â€. Thus, the sequence â€?, where ? is any character, will typically signify this type of encoding error. This allows this type of error to be corrected more broadly:

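// Requires: using System.Text; using System.Text.RegularExpressions;
static string CorrectText(string input)
{
    var winencoding = Encoding.GetEncoding("windows-1252");
    // Turn each "â€" plus one character back into windows-1252 bytes,
    // then decode those three bytes as UTF-8.
    return Regex.Replace(input, "â€.",
        m => Encoding.UTF8.GetString(winencoding.GetBytes(m.Value)));
}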

Calling this function with text malformed in this way will correct some (but not all) errors. For instance, CorrectText("â€¢Testâ€“orâ€œ") will return the intended •Test–or“.
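One reason some errors remain is that the pattern only matches sequences beginning with â€. Other mis-decoded characters, such as an accented letter that became two characters, are left alone. A possible broader variant, sketched below under a hypothetical name (CorrectTextBroadly), tries to repair every run of non-ASCII characters and keeps the run unchanged if it does not form valid UTF-8. It is an illustration rather than a complete fix.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class MojibakeRepair // hypothetical name, for illustration only
{
    public static string CorrectTextBroadly(string input)
    {
        // Strict encodings throw on anything that does not round-trip.
        Encoding strict1252 = Encoding.GetEncoding("windows-1252",
            EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);
        Encoding strictUtf8 = new UTF8Encoding(false, true);

        // Each run of non-ASCII characters is a repair candidate.
        return Regex.Replace(input, @"[^\u0000-\u007F]+", m =>
        {
            try
            {
                return strictUtf8.GetString(strict1252.GetBytes(m.Value));
            }
            catch (ArgumentException)
            {
                // Not a valid UTF-8 byte sequence: leave the run as it was.
                return m.Value;
            }
        });
    }

    static void Main()
    {
        // Assumption: .NET Core / .NET 5+, where windows-1252 must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Garbled "e-acute" (U+00C3 U+00A9), the garbled bullet from the
        // question, and a correctly encoded U+00EF that should be kept.
        string input = "caf\u00C3\u00A9 \u00E2\u20AC\u00A2 na\u00EFve";
        string output = CorrectTextBroadly(input);
        Console.WriteLine(output == "caf\u00E9 \u2022 na\u00EFve");
    }
}
```

This variant has trade-offs of its own: a run that mixes correct and garbled characters is left untouched as a whole, and legitimate text that happens to look like mis-decoded UTF-8 would be altered.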

Tags:

html

c#

user2388013
