Quote from C++03 2.2 Character sets:
"The basic execution character set and the basic execution wide-character set shall each contain all the members of the basic source character set..The values of the members of the execution character sets are implementation-defined, and any additional members are locale-specific."
According to this, the value of 'A', which belongs to the execution character set, is implementation-defined. So it might not be 65 (the ASCII code of 'A' in decimal)? What?!
// Not always 65?
printf("%d", 'A');
Or do I have a misunderstanding about the value of a character in the execution character set?
The execution character set is the character set your program uses at run time: it determines the internal representation of any string or character literals in the compiled code, and it may differ from the source character set the code was written in.
A character set is the key component behind displaying, manipulating, and editing text, numbers, and symbols on a computer. It is defined through a process known as encoding: each character is assigned a unique code or value.
Character sets used today in the US are generally 8-bit sets with 256 different characters, effectively doubling the 128-character ASCII set.
A character set is made up of a series of code points, the numeric values that represent each character. For example, the code point for the letter A in international EBCDIC is 0xC1 (decimal 193). A character set can also be called a coded character set, a code set, a code page, or an encoding.
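As a small sketch of code points in practice (the ASCII and EBCDIC values below are the well-known ones; the program itself is only illustrative):

#include <cstdio>

int main() {
    // Well-known code points for the letter A:
    const int ascii_A  = 0x41; // 65 in ASCII (and supersets such as UTF-8)
    const int ebcdic_A = 0xC1; // 193 in international EBCDIC

    // 'A' evaluates to whatever value the execution character set assigns.
    printf("'A' == %d (ASCII: %d, EBCDIC: %d)\n", 'A', ascii_A, ebcdic_A);
    return 0;
}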
Of course it can be ASCII's 65, if the execution character set is ASCII or a superset (such as UTF-8).
The standard doesn't say "it can't be ASCII"; it says that it is something called "the execution character set".
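If code genuinely requires an ASCII-compatible execution character set, one way to make that assumption explicit, assuming a C++11 or later compiler (the C++03 text quoted above has no static_assert), is to fail the build when the assumption does not hold:

// Sketch: turn the hidden assumption into a compile-time check (C++11).
static_assert('A' == 65 && 'a' == 97 && '0' == 48,
              "this code assumes an ASCII-compatible execution character set");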
So the standard allows the "execution character set" to be something other than ASCII or ASCII derivatives. One example is the EBCDIC character set that IBM used for a long time (there are probably still machines around using EBCDIC, though I suspect anything built in the last 10-15 years wouldn't be using it). The encoding of characters in EBCDIC is completely different from ASCII.
So expecting, in code, that 'A' has any particular value is not portable. There is also a whole heap of other "common assumptions" that will fail: that there are no "holes" between A and Z, and that 'a' - 'A' == 32, are both false in EBCDIC. At least the characters A-Z are in the correct order! ;)
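A sketch of the portable alternative: rather than doing arithmetic on code points, use the standard <cctype> classification and conversion functions, which are defined in terms of whatever the execution character set happens to be:

#include <cctype>
#include <cstdio>

int main() {
    char c = 'G';

    // Non-portable: assumes contiguous A-Z and a fixed case offset;
    // both assumptions fail under EBCDIC.
    // char lower = c + ('a' - 'A');

    // Portable: isupper/tolower work for any execution character set.
    // The unsigned char cast avoids undefined behavior for negative char values.
    if (std::isupper(static_cast<unsigned char>(c)))
        printf("%c\n", std::tolower(static_cast<unsigned char>(c)));
    return 0;
}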