
Why do I get a different value after turning an integer into ASCII and then back to an integer?

Tags:

c#

ascii

Why, when I convert an int value to bytes, then to an ASCII string and back, do I get a different value?

Example:

var asciiStr = new string(Encoding.ASCII.GetChars(BitConverter.GetBytes(2000)));
var intVal = BitConverter.ToInt32(Encoding.ASCII.GetBytes(asciiStr), 0);
Console.WriteLine(intVal);

// Result: 1855
asked Oct 08 '20 by ITboy


2 Answers

ASCII is only 7-bit; code points above 127 are unsupported. Unsupported characters are converted to ? per the docs on Encoding.ASCII:

The ASCIIEncoding object that is returned by this property might not have the appropriate behavior for your app. It uses replacement fallback to replace each string that it cannot encode and each byte that it cannot decode with a question mark ("?") character.

So 2000 decimal = D0 07 00 00 hexadecimal (little endian) = [unsupported character] [BEL character] [NUL character] [NUL character] = ? [BEL character] [NUL character] [NUL character] = 3F 07 00 00 hexadecimal (little endian) = 1855 decimal.
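If you want to watch that transformation happen, here's a minimal, self-contained sketch of the round trip (assuming a little-endian machine, which is what the byte ordering above reflects):

using System;
using System.Text;

byte[] original = BitConverter.GetBytes(2000);
// Prints D0-07-00-00: 2000 in little-endian hex
Console.WriteLine(BitConverter.ToString(original));

string ascii = new string(Encoding.ASCII.GetChars(original));
byte[] roundTripped = Encoding.ASCII.GetBytes(ascii);
// Prints 3F-07-00-00: the 0xD0 byte was replaced by '?' (0x3F)
Console.WriteLine(BitConverter.ToString(roundTripped));

// Prints 1855
Console.WriteLine(BitConverter.ToInt32(roundTripped, 0));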

answered Oct 19 '22 by Joe Sewell

TL;DR: Everything's fine. But you're a victim of character replacement.

We start with 2000. Let's acknowledge, first, that this number can be represented in hexadecimal as 0x000007d0.

BitConverter.GetBytes

BitConverter.GetBytes(2000) is an array of 4 bytes, because 2000 is a 32-bit integer literal. So the 32-bit integer representation, in little endian (least significant byte first), is given by the byte sequence { 0xd0, 0x07, 0x00, 0x00 }. In decimal, those same bytes are { 208, 7, 0, 0 }.
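A quick sketch to verify this step in isolation (BitConverter.ToString is used purely for hex display; the output assumes a little-endian machine):

using System;

byte[] bytes = BitConverter.GetBytes(2000);
// Prints D0-07-00-00 on a little-endian machine
Console.WriteLine(BitConverter.ToString(bytes));
// Prints 208, 7, 0, 0
Console.WriteLine(string.Join(", ", bytes));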

Encoding.ASCII.GetChars

Uh oh! Problem. Here's where things likely took an unexpected turn for you.

You're asking the system to interpret those bytes as ASCII-encoded data. The problem is that ASCII uses codes from 0-127. The byte with value 208 (0xd0) doesn't correspond to any character encodable by ASCII. So what actually happens?

When decoding ASCII, if the decoder encounters a byte outside the range 0-127, it decodes that byte to a replacement character and moves on to the next byte. This replacement character is a question mark ?. So the 4 chars you get back from Encoding.ASCII.GetChars are ?, BEL (bell), NUL (null) and NUL (null).

BEL is the ASCII name of the character with code 7, which traditionally elicits a beep when presented on a capable terminal. NUL (code 0) is a null character traditionally used for representing the end of a string.
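A short sketch of that decoding step, printing each resulting char's numeric code so the replacement and the control characters are visible:

using System;
using System.Text;

char[] chars = Encoding.ASCII.GetChars(new byte[] { 0xd0, 0x07, 0x00, 0x00 });
// Prints 63 ('?'), 7 (BEL), 0 (NUL), 0 (NUL)
foreach (char c in chars)
    Console.WriteLine((int)c);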

new string

Now you create a string from that array of chars. A C# string is perfectly capable of containing NUL characters in its body, so your string will have two NUL chars in it. They can be represented in C# string literals with "\0", in case you want to try that yourself. A C# string literal that represents the string you have would be "?\a\0\0". Did you know that the BEL character can be represented with the escape sequence \a? Many people don't.
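You can check that literal equivalence yourself with a couple of lines (a sketch using the escape sequences just mentioned):

using System;
using System.Text;

string s = new string(Encoding.ASCII.GetChars(new byte[] { 0xd0, 0x07, 0x00, 0x00 }));
// Prints True: the decoded string is exactly "?\a\0\0"
Console.WriteLine(s == "?\a\0\0");
// Prints 4: the embedded NULs are ordinary chars, not terminators
Console.WriteLine(s.Length);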

Encoding.ASCII.GetBytes

Now you begin the reverse journey. Your string consists entirely of characters in the ASCII range. The encoding of a question mark is code 63 (0x3f), the BEL is 7, and the NUL is 0, so the bytes are { 0x3f, 0x07, 0x00, 0x00 }. Surprised? Well, you're encoding a question mark now, where before you provided a 208 (0xd0) byte that was not representable with ASCII encoding.
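A sketch of that encoding step, showing that the question mark goes out as 0x3f:

using System;
using System.Text;

byte[] encoded = Encoding.ASCII.GetBytes("?\a\0\0");
// Prints 3F-07-00-00: the '?' encodes as 0x3F, not the original 0xD0
Console.WriteLine(BitConverter.ToString(encoded));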

BitConverter.ToInt32

Converting these four bytes back to a 32-bit integer gives the integer 0x0000073f, which, in decimal, is 1855.
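And the last step, for completeness, as a sketch:

using System;

// Prints 1855 (0x0000073f)
Console.WriteLine(BitConverter.ToInt32(new byte[] { 0x3f, 0x07, 0x00, 0x00 }, 0));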

answered Oct 19 '22 by Wyck