Why, when I turn an int value into bytes, then into an ASCII string, and back again, do I get a different value?
Example:
var asciiStr = new string(Encoding.ASCII.GetChars(BitConverter.GetBytes(2000)));
var intVal = BitConverter.ToInt32(Encoding.ASCII.GetBytes(asciiStr), 0);
Console.WriteLine(intVal);
// Result: 1855
ASCII is only 7-bit; code points above 127 are unsupported. Unsupported characters are converted to ?, per the docs on Encoding.ASCII:

The ASCIIEncoding object that is returned by this property might not have the appropriate behavior for your app. It uses replacement fallback to replace each string that it cannot encode and each byte that it cannot decode with a question mark ("?") character.

So:

2000 decimal
= D0 07 00 00 hexadecimal (little endian)
= [unsupported character] [BEL character] [NUL character] [NUL character]
= ? [BEL character] [NUL character] [NUL character]
= 3F 07 00 00 hexadecimal (little endian)
= 1855 decimal
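To watch those intermediate values for yourself, a minimal sketch along these lines should work. It assumes using System; and using System.Text; at the top, and a little-endian machine (what BitConverter produces on typical x86/x64 hardware):

var original = BitConverter.GetBytes(2000);                   // the original four bytes
var decoded = new string(Encoding.ASCII.GetChars(original));  // "?\a\0\0" after replacement
var reencoded = Encoding.ASCII.GetBytes(decoded);             // the bytes after re-encoding
Console.WriteLine(BitConverter.ToString(original));           // D0-07-00-00
Console.WriteLine(BitConverter.ToString(reencoded));          // 3F-07-00-00
Console.WriteLine(BitConverter.ToInt32(reencoded, 0));        // 1855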
TL;DR: Everything's fine. But you're a victim of character replacement.
We start with 2000. Let's acknowledge, first, that this number can be represented in hexadecimal as 0x000007d0.
BitConverter.GetBytes(2000) is an array of 4 bytes, because 2000 is a 32-bit integer literal. So the 32-bit integer representation, in little endian (least significant byte first), is given by the byte sequence { 0xd0, 0x07, 0x00, 0x00 }. In decimal, those same bytes are { 208, 7, 0, 0 }.
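A quick way to check those bytes on your own machine is the following snippet (illustrative; the outputs shown assume a little-endian platform, which BitConverter.IsLittleEndian reports):

var bytes = BitConverter.GetBytes(2000);
Console.WriteLine(BitConverter.IsLittleEndian);     // True on typical x86/x64 hardware
Console.WriteLine(BitConverter.ToString(bytes));    // D0-07-00-00
Console.WriteLine(string.Join(", ", bytes));        // 208, 7, 0, 0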
Uh oh! Problem. Here's where things likely took an unexpected turn for you.
You're asking the system to interpret those bytes as ASCII-encoded data. The problem is that ASCII uses codes from 0-127. The byte with value 208 (0xd0) doesn't correspond to any character encodable by ASCII. So what actually happens?
When the ASCII decoder encounters a byte outside the range 0-127, it decodes that byte to a replacement character and moves on to the next byte. This replacement character is a question mark ?. So the 4 chars you get back from Encoding.ASCII.GetChars are ?, BEL (bell), NUL (null) and NUL (null).
BEL is the ASCII name of the character with code 7, which traditionally elicits a beep when presented on a capable terminal. NUL (code 0) is a null character traditionally used for representing the end of a string.
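You can confirm the replacement by printing the numeric code of each decoded char (a small sketch; the values in the comment are what the explanation above predicts):

var chars = Encoding.ASCII.GetChars(BitConverter.GetBytes(2000));
foreach (var c in chars)
{
    Console.WriteLine((int)c);   // 63 ('?'), 7 (BEL), 0 (NUL), 0 (NUL)
}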
Now you create a string from that array of chars. In C# a string is perfectly capable of representing a NUL character within the body of a string, so your string will have two NUL chars in it. They can be represented in C# string literals with "\0", in case you want to try that yourself. A C# string literal that represents the string you have would be "?\a\0\0". Did you know that the BEL character can be represented with the escape sequence \a? Many people don't.
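If you want to verify that literal, a comparison like the following should print True (again assuming the byte sequence derived above):

var asciiStr = new string(Encoding.ASCII.GetChars(BitConverter.GetBytes(2000)));
Console.WriteLine(asciiStr == "?\a\0\0");   // True
Console.WriteLine(asciiStr.Length);         // 4; the embedded NULs do not truncate the string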
Now you begin the reverse journey. Your string is composed entirely of characters in the ASCII range. The encoding of a question mark is code 63 (0x3F), the BEL is 7, and the NUL is 0, so the bytes are { 0x3f, 0x07, 0x00, 0x00 }. Surprised? Well, you're now encoding a question mark where before you provided a 208 (0xd0) byte that was not representable with ASCII encoding.
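Encoding that literal directly shows the same four bytes, which is one way to convince yourself that the replacement, not the round trip itself, is to blame (illustrative snippet):

var reencoded = Encoding.ASCII.GetBytes("?\a\0\0");
Console.WriteLine(BitConverter.ToString(reencoded));   // 3F-07-00-00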
Converting these four bytes back to a 32-bit integer gives the integer 0x0000073f, which, in decimal, is 1855.
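That last step is just ordinary little-endian arithmetic: 0x3f + (0x07 << 8) = 63 + 1792 = 1855. A final snippet to tie it together:

var lastBytes = new byte[] { 0x3f, 0x07, 0x00, 0x00 };
Console.WriteLine(BitConverter.ToInt32(lastBytes, 0));   // 1855
Console.WriteLine(0x0000073f);                           // 1855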