
Encoding char in C# [duplicate]

Possible Duplicate:
To which character encoding (Unicode version) set does a char object correspond?

I'm a little afraid to ask this, as I'm sure it's been asked before, but I can't find it. It's probably something obvious, but I've never studied encoding before.

int Convert(char c)
{
    return (int)c;
}

What encoding is produced by that method? I thought it might be ASCII (at least for values below 128), but running the code below produced... smiley faces as the first characters? What? Definitely not ASCII...

for (int i = 0; i < 128; i++)
    Console.WriteLine(i + ": " + (char)i);

1 Answer

C# char uses the UTF-16 encoding. The language specification, 1.3 Types and variables, says:

Character and string processing in C# uses Unicode encoding. The char type represents a UTF-16 code unit, and the string type represents a sequence of UTF-16 code units.
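To make the "code unit" wording concrete, here is a minimal sketch of what casting a char to int returns. The class name `CharEncodingDemo` and the helper `CodeUnits` are made up for illustration; `char.ConvertFromUtf32` and `char.ConvertToUtf32` are standard .NET methods.

```csharp
using System;
using System.Linq;

public static class CharEncodingDemo
{
    // Hypothetical helper: returns each UTF-16 code unit of s as an int.
    public static int[] CodeUnits(string s) => s.Select(c => (int)c).ToArray();

    public static void Main()
    {
        // In the ASCII range the code unit equals the ASCII code.
        Console.WriteLine((int)'A');        // 65

        // U+00E9 is outside ASCII but still fits in one code unit.
        Console.WriteLine((int)'\u00E9');   // 233

        // U+1F600 does not fit in 16 bits, so it takes two chars
        // (a surrogate pair) in a C# string.
        string emoji = char.ConvertFromUtf32(0x1F600);
        Console.WriteLine(emoji.Length);    // 2
        foreach (int unit in CodeUnits(emoji))
            Console.WriteLine("0x" + unit.ToString("X4")); // 0xD83D, then 0xDE00

        // Recombine the pair into the full code point.
        Console.WriteLine(char.ConvertToUtf32(emoji, 0).ToString("X")); // 1F600
    }
}
```

So `(int)c` gives the code point only for characters that fit in a single code unit; for anything above U+FFFF it gives one half of a surrogate pair.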

UTF-16 overlaps with ASCII in that the character codes in the ASCII range 0-127 mean the same thing in UTF-16 as in ASCII. The smiley faces in your program's output are presumably how your console interprets the non-printable characters in the range 0-31.
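As a rough illustration of both points, the sketch below checks that the ASCII byte for each value below 128 equals the char's integer value, and prints a label for control characters instead of writing them raw to the console. `AsciiOverlapDemo`, `Describe` and `MatchesAscii` are hypothetical names; the exact glyphs a console shows for raw control characters depend on its code page and font, so the label is just one way to sidestep that.

```csharp
using System;
using System.Text;

public static class AsciiOverlapDemo
{
    // Hypothetical helper: shows control characters as a U+XXXX label
    // rather than sending them to the console unchanged.
    public static string Describe(char c) =>
        char.IsControl(c) ? $"<control U+{(int)c:X4}>" : c.ToString();

    // True when c is in the ASCII range and its ASCII byte equals
    // its UTF-16 code unit value.
    public static bool MatchesAscii(char c) =>
        c < 128 && Encoding.ASCII.GetBytes(new[] { c })[0] == (int)c;

    public static void Main()
    {
        // Same loop as in the question, with control characters labelled.
        for (int i = 0; i < 128; i++)
            Console.WriteLine(i + ": " + Describe((char)i) +
                              " (matches ASCII: " + MatchesAscii((char)i) + ")");
    }
}
```

With this version the first 32 lines show labels such as `<control U+0001>` rather than whatever the console draws for those characters.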

David Heffernan answered Apr 30 '26 18:04
