A weird thing in C# Encoding

Question

I convert a byte array to a string, and then convert that string back to a byte array. The two byte arrays are different.

As below:

byte[] tmp = Encoding.ASCII.GetBytes(Encoding.ASCII.GetString(b));

Suppose b is a byte array.

b[0]=3, b[1]=188, b[2]=2 //decimal system

Result:

tmp[0]=3, tmp[1]=63, tmp[2]=2

So that's my problem. What's wrong with it?

Rowland Shaw · Accepted Answer

188 is out of range for ASCII, which only covers 0 to 127. Characters that are not in the corresponding character set are replaced with '?' (code 63) by design (would you prefer transposing to "1/4"?)
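To see the replacement happen, here is a minimal sketch that reruns the round trip from the question. The class and method names (AsciiRoundTrip, RoundTrip) are made up for illustration; only the Encoding.ASCII calls come from the question.

```csharp
using System;
using System.Text;

class AsciiRoundTrip
{
    // Hypothetical helper: decode bytes as ASCII, then encode the string again.
    public static byte[] RoundTrip(byte[] input)
    {
        string s = Encoding.ASCII.GetString(input); // 188 cannot be decoded as ASCII
        return Encoding.ASCII.GetBytes(s);          // the replacement character is encoded as 63
    }

    static void Main()
    {
        byte[] b = { 3, 188, 2 };
        byte[] tmp = RoundTrip(b);

        Console.WriteLine(string.Join(", ", tmp)); // 3, 63, 2
        Console.WriteLine((int)'?');               // 63
    }
}
```

The original value 188 is gone after decoding, so no later step can recover it from the string.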

Alvin Wong · Answer

ASCII is 7-bit only, so byte values above 127 are invalid. By default, the encoding replaces any invalid byte with ?, and that's why you get a ? (63).
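The "?" replacement is only the default fallback. As a sketch, assuming you would rather fail loudly than lose data silently, you can ask for an ASCII encoding with exception fallbacks. The class and method names (StrictAscii, IsValidAscii) are hypothetical.

```csharp
using System;
using System.Text;

class StrictAscii
{
    // Hypothetical helper: returns true only if every byte decodes as ASCII.
    public static bool IsValidAscii(byte[] input)
    {
        // Same ASCII encoding, but invalid input throws instead of becoming '?'.
        Encoding strict = Encoding.GetEncoding(
            "us-ascii",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            strict.GetString(input);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(IsValidAscii(new byte[] { 3, 188, 2 })); // False
        Console.WriteLine(IsValidAscii(new byte[] { 3, 65, 2 }));  // True
    }
}
```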

For an 8-bit character set, you should be looking at either ISO 8859-1 (Latin-1, which is what "Extended ASCII" often ends up meaning) or code page 437 (which is often confused with Extended ASCII, but is in fact a different encoding).

You can use the following code:

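// ISO 8859-1 maps every byte value 0-255 to a character, so nothing is replaced.
Encoding enc = Encoding.GetEncoding("iso-8859-1");
// For CP437, use Encoding.GetEncoding(437). On .NET Core / .NET 5+ this requires
// registering CodePagesEncodingProvider (System.Text.Encoding.CodePages) first.
byte[] tmp = enc.GetBytes(enc.GetString(b));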
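As a quick check of the ISO 8859-1 suggestion, the sketch below round-trips all 256 byte values and compares the result with the input. The class and method names (Latin1RoundTrip, RoundTrip) are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

class Latin1RoundTrip
{
    // Hypothetical helper: decode bytes as ISO 8859-1, then encode the string again.
    public static byte[] RoundTrip(byte[] input)
    {
        Encoding enc = Encoding.GetEncoding("iso-8859-1");
        return enc.GetBytes(enc.GetString(input));
    }

    static void Main()
    {
        // Every possible byte value, 0 through 255.
        byte[] all = Enumerable.Range(0, 256).Select(i => (byte)i).ToArray();
        Console.WriteLine(all.SequenceEqual(RoundTrip(all))); // True

        byte[] b = { 3, 188, 2 };
        Console.WriteLine(string.Join(", ", RoundTrip(b)));   // 3, 188, 2
    }
}
```

This works because ISO 8859-1 assigns a character to each of the 256 byte values, so decoding never needs a replacement character.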

Tags: c#, encoding

roast_soul
