
Is there any logic behind ASCII codes' ordering?

Tags:

char

ascii

I was teaching C to my younger brother, who is studying engineering. I was explaining to him how different data types are actually stored in memory. I explained the logic behind signed and unsigned numbers and how floating-point numbers are represented in bits. While telling him about the char type in C, I also took him through the ASCII code system and how a char is stored as a 1-byte number.

He asked me why 'A' has been given ASCII code 65 and not something else. Similarly, why is 'a' given the code 97 specifically? Why is there a gap of 6 ASCII codes between the range of capital letters and the range of small letters? I had no idea. Can you help me understand this? It has made me very curious as well, and I have never found a book that discusses this topic.

What is the reason behind this? Are ASCII codes logically organized?

asked Jul 16 '09 08:07 by this. __curious_geek


2 Answers

There are historical reasons, mainly to make ASCII codes easy to convert:

Digits (0x30 to 0x39) share the binary prefix 011 (written as 7-bit codes), followed by the digit's value in four bits:

0 is 0110000
1 is 0110001
2 is 0110010

etc. So if you wipe out the prefix (the two '1's), you end up with the digit in binary-coded decimal.
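As a minimal sketch of this in C, assuming an ASCII execution character set: masking off the prefix leaves the digit's value, and OR-ing the prefix back restores the character. The function names `digit_value` and `digit_char` are hypothetical, chosen only for this illustration.

```c
/* Sketch assuming ASCII; digit_value and digit_char are illustrative names. */

/* Returns the numeric value 0-9 of an ASCII digit character,
   or -1 if c is not in the range '0'..'9'. */
int digit_value(char c)
{
    if (c < '0' || c > '9')
        return -1;
    /* '0'..'9' are 0x30..0x39, so the low four bits hold the value. */
    return c & 0x0F;
}

/* Inverse mapping: turns 0-9 back into '0'..'9' by restoring the
   0x30 prefix. Returns '?' for values outside 0-9. */
char digit_char(int v)
{
    if (v < 0 || v > 9)
        return '?';
    return (char)(0x30 | v);
}
```

For example, `digit_value('7')` yields 7 and `digit_char(5)` yields '5'. In everyday C the more common idiom is `c - '0'`, which gives the same result for digit characters.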

Capital letters share the binary prefix 10, followed by the letter's position in the alphabet in five bits:

A is 1000001
B is 1000010
C is 1000011

etc. Same thing: if you remove the prefix (the first '1'), you end up with alphabet-indexed characters (A is 1, Z is 26, etc.).
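A small C sketch of that indexing, again assuming ASCII: keeping only the low five bits of a letter gives its position in the alphabet. The name `alphabet_index` is hypothetical. Because lowercase letters carry the same low five bits, the same mask happens to work for them too.

```c
/* Sketch assuming ASCII; alphabet_index is an illustrative name. */

/* Returns 1 for 'A' or 'a' through 26 for 'Z' or 'z',
   and 0 if c is not an ASCII letter. */
int alphabet_index(char c)
{
    int is_upper = (c >= 'A' && c <= 'Z');
    int is_lower = (c >= 'a' && c <= 'z');

    if (!is_upper && !is_lower)
        return 0;
    /* The low five bits of a letter code are its alphabet position. */
    return c & 0x1F;
}
```

For example, `alphabet_index('A')` is 1 and `alphabet_index('Z')` is 26.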

Lowercase letters share the binary prefix 11, followed by the same five-bit alphabet position:

a is 1100001
b is 1100010
c is 1100011

etc. Same as above. So if you add 32 (binary 100000) to a capital letter, you get the lowercase version.
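Since the two cases differ only in the bit worth 32 (0x20), case conversion can be sketched in C as setting or clearing that single bit. This assumes ASCII, and the names `to_lower_ascii` and `to_upper_ascii` are hypothetical; in real code the standard `tolower` and `toupper` from `<ctype.h>` are the usual choice.

```c
/* Sketch assuming ASCII; function names are illustrative. */

/* Converts 'A'..'Z' to 'a'..'z' by setting bit 0x20;
   any other character is returned unchanged. */
char to_lower_ascii(char c)
{
    if (c >= 'A' && c <= 'Z')
        return (char)(c | 0x20);
    return c;
}

/* Converts 'a'..'z' to 'A'..'Z' by clearing bit 0x20;
   any other character is returned unchanged. */
char to_upper_ascii(char c)
{
    if (c >= 'a' && c <= 'z')
        return (char)(c & ~0x20);
    return c;
}
```

For example, `to_lower_ascii('A')` gives 'a' (65 + 32 = 97) and `to_upper_ascii('z')` gives 'Z'. The range checks matter: applying the bit operation to a non-letter such as '@' would turn it into a different character.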

answered Oct 05 '22 17:10 by FWH


This chart from Wikipedia shows it quite well. Notice the two columns of control characters, two of uppercase, and two of lowercase, with the gaps filled in with miscellaneous characters. (ASCII chart on Wikipedia)

Also bear in mind that ASCII was developed based on what had come before. For more detail on the history of ASCII, see this superb article by Tom Jennings, which also covers the meaning and usage of some of the stranger control characters.

answered Oct 05 '22 18:10 by Mesh