
Casting non-numeric char to int?

Tags:

c#

I came across this line of code today:

int c = (int)'c';

I was not aware you could cast a char to an int. So I tested it out and found that a=97, b=98, c=99, d=100, and so on.

Why is 'a' 97? What do those numbers relate to?

asked Dec 06 '22 by Mike Baxter

2 Answers

Everyone else (so far) has referred to ASCII. That's a very limited view - it works for 'a', but it doesn't explain anything with an accent etc., which can easily be represented by a char.

A char is just an unsigned 16-bit integer, which is a UTF-16 code unit. Usually that's equivalent to a Unicode character, but not always - sometimes multiple code units are required for a single full character. See the documentation for System.Char for more details.

The implicit conversion from char to int (you don't need the cast in your code) just converts that 16-bit unsigned integer to a 32-bit signed integer in the natural, non-lossy way - just as if you had a ushort.

Note that every valid character in ASCII has the same value in UTF-16, which is why the two are often confused when the examples are only ones from the ASCII set.
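A short sketch makes this concrete. Java's char has the same semantics as C#'s (a 16-bit UTF-16 code unit that widens implicitly to int), so the following Java example mirrors the behaviour described above, including a character that needs two code units:

```java
public class CharAsInt {
    public static void main(String[] args) {
        int c = 'c';                       // implicit widening, no cast needed
        System.out.println(c);             // 99

        System.out.println((int) 'a');     // 97  - same value as in ASCII
        System.out.println((int) 'é');     // 233 - outside ASCII, still fits in one char

        // Some characters need two UTF-16 code units (a surrogate pair):
        String emoji = "😀";               // U+1F600
        System.out.println(emoji.length());        // 2 - two code units
        System.out.println((int) emoji.charAt(0)); // 55357 (high surrogate)
        System.out.println((int) emoji.charAt(1)); // 56832 (low surrogate)
    }
}
```

The surrogate pair shows why "one char = one character" doesn't always hold: the emoji is a single character but occupies two code units, and neither code unit's integer value is the character's code point.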

answered Dec 22 '22 by Jon Skeet


97 is the UTF-16 code unit value of the letter 'a'.

Basically, each of these numbers is the UTF-16 code unit of the given character.
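The mapping works in both directions. The sketch below uses Java, whose char has the same UTF-16 semantics as C#'s: widening from char to int is implicit, while narrowing from int back to char requires an explicit cast.

```java
public class RoundTrip {
    public static void main(String[] args) {
        int code = 'a';           // char -> int widens implicitly: 97
        char back = (char) code;  // int -> char needs an explicit cast
        System.out.println(code); // 97
        System.out.println(back); // a
    }
}
```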

answered Dec 22 '22 by Zbigniew