Why are C character literals ints instead of chars?

Tags: c++, c, char, sizeof

In C++, sizeof('a') == sizeof(char) == 1. This makes intuitive sense, since 'a' is a character literal, and sizeof(char) == 1 as defined by the standard.

In C however, sizeof('a') == sizeof(int). That is, it appears that C character literals are actually integers. Does anyone know why? I can find plenty of mentions of this C quirk but no explanation for why it exists.
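
A minimal program makes the difference concrete (the sizes in the comments assume a typical platform where int is four bytes):

#include <stdio.h>

int main(void)
{
    /* In C, 'a' has type int, so this typically prints 4.
       Compiled as C++, 'a' has type char and it prints 1. */
    printf("sizeof('a')  = %zu\n", sizeof('a'));

    /* sizeof(char) is 1 by definition in both languages. */
    printf("sizeof(char) = %zu\n", sizeof(char));
    return 0;
}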

asked Jan 11 '09 by Joseph Garvin



3 Answers

From a discussion on the same subject:

"More specifically the integral promotions. In K&R C it was virtually (?) impossible to use a character value without it being promoted to int first, so making character constant int in the first place eliminated that step. There were and still are multi character constants such as 'abcd' or however many will fit in an int."

answered Nov 15 '22 by Malx


The original question is "why?"

The reason is that the definition of a character literal has evolved and changed over time, while trying to remain backwards compatible with existing code.

In the dark days of early C there were no types at all. By the time I first learnt to program in C, types had been introduced, but functions didn't have prototypes to tell the caller what the argument types were. Instead it was standardised that everything passed as a parameter would either be the size of an int (this included all pointers) or it would be a double.

This meant that when you were writing the function, all the parameters that weren't double were stored on the stack as ints, no matter how you declared them, and the compiler put code in the function to handle this for you.
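
As a sketch of what that looked like, here is a K&R-style (pre-ANSI) definition; the function name is made up for illustration, and compilers in older modes still accept the syntax even though C23 finally removed it:

#include <stdio.h>

/* No prototype: the caller applies the default promotions --
   char and short travel as int, float as double -- and the
   callee narrows the values back on entry. */
double scale(factor, c)
    double factor;
    char c;              /* the argument actually arrives as an int */
{
    return factor * c;
}

int main(void)
{
    printf("%f\n", scale(2.0, 'a'));  /* 'a' is passed as an int */
    return 0;
}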

This made things somewhat inconsistent, so when K&R wrote their famous book, they put in the rule that a character literal would always be promoted to an int in any expression, not just a function parameter.

When the ANSI committee first standardised C, they changed this rule so that a character literal would simply be an int, since this seemed a simpler way of achieving the same thing.

When C++ was being designed, all functions were required to have full prototypes (this is still not required in C, although it is universally accepted as good practice). Because of this, it was decided that a character literal could be given type char. The advantage of this in C++ is that a function with a char parameter and a function with an int parameter have different signatures. That advantage does not exist in C, which has no function overloading.

This is why they are different. Evolution...

answered Nov 15 '22 by John Vincent


I don't know the specific reasons why a character literal in C is of type int. But in C++, there is a good reason not to go that way. Consider this:

#include <iostream>

void print(int)  { std::cout << "int\n"; }
void print(char) { std::cout << "char\n"; }

int main() { print('a'); }  // selects print(char) in C++

You would expect the call to print to select the second version, the one taking a char. If character literals had type int, that would be impossible. Note that in C++, literals with more than one character still have type int, although their value is implementation-defined. So 'ab' has type int, while 'a' has type char.

answered Nov 16 '22 by Johannes Schaub - litb