
What's the value of characters in execution character set?

Tags:

c++

Quote from C++03 2.2 Character sets:

"The basic execution character set and the basic execution wide-character set shall each contain all the members of the basic source character set ... The values of the members of the execution character sets are implementation-defined, and any additional members are locale-specific."

According to this, the value of 'A', which belongs to the execution character set, is implementation-defined. So it isn't necessarily 65 (the ASCII code of 'A' in decimal)?!

// Not always 65?
printf ("%d", 'A');

Or have I misunderstood what the value of a character in the execution character set is?

Eric Z asked May 02 '13 13:05




2 Answers

Of course it can be ASCII's 65, if the execution character set is ASCII or a superset (such as UTF-8).

The standard doesn't say the value can't be the ASCII one; it says the value comes from something called "the execution character set", which the implementation chooses.
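To see what a given implementation actually does, something like the following rough sketch could be used. The helper names (a_is_ascii, digit_value, report) are made up for this illustration and are not from the standard or the question. The digit arithmetic is included because the decimal digits are the one place where the standard does require consecutive values.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: true when this implementation gives 'A' the ASCII value.
bool a_is_ascii() { return 'A' == 65; }

// Digits are the exception: the standard requires '0'..'9' to have
// consecutive values, so this arithmetic is portable.
int digit_value(char c) {
    assert(c >= '0' && c <= '9');
    return c - '0';
}

// Prints whatever value this implementation assigns to 'A'.
void report() {
    std::printf("'A' == %d, which %s ASCII\n", 'A',
                a_is_ascii() ? "matches" : "differs from");
}
```

Calling report() from main shows the value on your platform; the output depends on the execution character set, so no particular number is promised here.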

unwind answered Oct 22 '22 20:10



So, the standard allows the "execution character set" to be something other than ASCII or an ASCII derivative. One example is the EBCDIC character set that IBM has used for a long time (there are probably still machines around using EBCDIC, but I suspect anything built in the last 10-15 years wouldn't be using that). The encoding of characters in EBCDIC is completely different from ASCII.

So code that expects 'A' to have any particular value is not portable. A whole heap of other "common assumptions" also fail: that there are no "holes" between A and Z, and that 'a' - 'A' == 32, are both false in EBCDIC. At least the characters A-Z are in the correct order! ;)
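As a minimal sketch of how to avoid those assumptions (the function names letter_index, to_upper and self_check are hypothetical, chosen just for this example): look letters up in an explicit alphabet instead of computing c - 'A', and let the standard library handle case conversion instead of adding or subtracting 32.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Position of c in an explicit alphabet, or -1 if c is not an uppercase
// basic Latin letter. Unlike c - 'A', this does not assume the letters
// are contiguous (in EBCDIC they are not).
int letter_index(char c) {
    static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    if (c == '\0') return -1;  // strchr would otherwise match the terminator
    const char* p = std::strchr(alphabet, c);
    return p ? static_cast<int>(p - alphabet) : -1;
}

// Case conversion through the library rather than arithmetic on char values.
char to_upper(char c) {
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Checks that should hold regardless of the execution character set
// (assuming the default "C" locale).
void self_check() {
    assert(letter_index('A') == 0);
    assert(letter_index('Z') == 25);
    assert(letter_index('a') == -1);
    assert(to_upper('a') == 'A');
}
```

The lookup is slower than a subtraction, but it only relies on the letters being members of the basic execution character set, not on their values.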

Mats Petersson answered Oct 22 '22 20:10
