Many answers to similar questions point out that it is so due to the standard. But, I cannot understand the reasoning behind this decision by the standard setters.
From my understanding, an unsigned char does not store its value in 2's complement form. So, I don't see a situation where, let's say, XORing two unsigned chars would produce unexpected behavior. Therefore, promoting them to int just seems like a waste of space (in most cases) and CPU cycles.
Moreover, why int? If a variable is declared as unsigned, clearly the unsignedness is important to the programmer, so a promotion to unsigned int would still make more sense than a promotion to int, in my opinion.
[EDIT #1] As pointed out in the comments, promotion to unsigned int will take place if an int cannot represent every value of the unsigned char.
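The rule in EDIT #1 can be checked on a given platform. What follows is only a minimal sketch, assuming a C11 compiler (it relies on _Generic and static_assert); TYPE_NAME and the function names are hypothetical helpers made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* TYPE_NAME is a hypothetical helper macro: it names the type of an
   expression using C11 _Generic. */
#define TYPE_NAME(x) _Generic((x), \
    int: "int",                    \
    unsigned int: "unsigned int",  \
    default: "other")

/* Whichever type the promotion picks, the promoted value has the size of int. */
static_assert(sizeof(+(unsigned char)0) == sizeof(int),
              "a promoted unsigned char has the size of int");

/* Unary + applies the integer promotions without changing the value. */
const char *promoted_uchar_type(void)
{
    unsigned char c = 200;
    return TYPE_NAME(+c);
}

const char *promoted_ushort_type(void)
{
    unsigned short s = 60000;
    return TYPE_NAME(+s);
}

/* The operands of ^ are promoted first, so the result is not unsigned char. */
const char *xor_result_type(void)
{
    unsigned char a = 0x0F, b = 0xF0;
    return TYPE_NAME(a ^ b);
}

/* The condition from EDIT #1: int is chosen only if it can hold every value. */
int ushort_fits_in_int(void)
{
    return USHRT_MAX <= INT_MAX;
}
```

On a typical platform with 16-bit short and 32-bit int, all three functions should report int; on a platform where short and int have the same width, promoted_ushort_type would report unsigned int instead.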
[EDIT #2] To clarify the question: if this is about the performance benefit of operating on int rather than char, then why is it in the standard? This could have been given as a suggestion to compiler designers for better optimization. As it stands, a compiler that did not do this would not fully conform to the C/C++ standard, even if it supported every other required feature of the language. In a nutshell, I cannot figure out why I cannot operate directly on unsigned chars, so the requirement to promote them to ints seems unnecessary. Can you give me an example which proves this wrong?
An 8-bit signed type and an 8-bit unsigned type can both store 256 different values, but signed integers use half of their range for negative numbers, whereas unsigned integers can store positive numbers that are twice as large. An n-bit unsigned variable has a range of 0 to 2^n - 1.
Unsigned integers are used when we know that the value that we are storing will always be non-negative (zero or positive). Note: it is almost always the case that you could use a regular integer variable in place of an unsigned integer.
Unsigned char should be used for accessing memory as a block of bytes or for small unsigned integers. Signed char should be used for small signed integers, and plain char should be used only for characters and strings.
On most current platforms, a signed int is a 32-bit datum that encodes an integer in the range [-2147483648, 2147483647], and an unsigned int is a 32-bit datum that encodes a nonnegative integer in the range [0, 4294967295]. The signed integer is represented in two's complement notation.
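The ranges quoted above follow from the bit width alone. As a rough sketch (max_unsigned_for_bits and min_signed_for_bits are hypothetical helpers, and the signed one assumes two's complement):

```c
#include <assert.h>
#include <limits.h>

/* unsigned char has no padding bits, so the 2^n - 1 formula can be
   checked against the platform's own UCHAR_MAX at compile time. */
static_assert(UCHAR_MAX == (1ULL << CHAR_BIT) - 1ULL,
              "UCHAR_MAX is 2^CHAR_BIT - 1");

/* Largest value of an unsigned integer with `bits` value bits: 2^bits - 1. */
unsigned long long max_unsigned_for_bits(unsigned bits)
{
    const unsigned width = (unsigned)(sizeof(unsigned long long) * CHAR_BIT);
    if (bits >= width)
        return ULLONG_MAX;      /* shifting by >= width would be undefined */
    return (1ULL << bits) - 1ULL;
}

/* Smallest value of a two's complement integer with `bits` bits: -2^(bits-1).
   Assumes 1 <= bits <= 63 so that the result fits in long long. */
long long min_signed_for_bits(unsigned bits)
{
    return -(long long)(1ULL << (bits - 1));
}
```

For example, max_unsigned_for_bits(32) gives 4294967295 and min_signed_for_bits(32) gives -2147483648, matching the 32-bit ranges above.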
You can find this document on-line: Rationale for International Standard - Programming Languages - C (Revision 5.10, 2003).
Chapter 6.3 (pp. 44-45) is about conversions:
Between the publication of K&R and the development of C89, a serious divergence had occurred among implementations in the evolution of integer promotion rules. Implementations fell into two major camps which may be characterized as unsigned preserving and value preserving.
The difference between these approaches centered on the treatment of unsigned char and unsigned short when widened by the integer promotions, but the decision had an impact on the typing of constants as well (see §6.4.4.1).
The unsigned preserving approach calls for promoting the two smaller unsigned types to unsigned int. This is a simple rule, and yields a type which is independent of execution environment.
The value preserving approach calls for promoting those types to signed int if that type can properly represent all the values of the original type, and otherwise for promoting those types to unsigned int. Thus, if the execution environment represents short as something smaller than int, unsigned short becomes int; otherwise it becomes unsigned int.
Both schemes give the same answer in the vast majority of cases, and both give the same effective result in even more cases in implementations with two's complement arithmetic and quiet wraparound on signed overflow - that is, in most current implementations. In such implementations, differences between the two only appear when these two conditions are both true:
1. An expression involving an unsigned char or unsigned short produces an int-wide result in which the sign bit is set, that is, either a unary operation on such a type, or a binary operation in which the other operand is an int or “narrower” type.
2. The result of the preceding expression is used in a context in which its signedness is significant:
    • sizeof(int) < sizeof(long) and it is in a context where it must be widened to a long type, or
    • it is the left operand of the right-shift operator in an implementation where this shift is defined as arithmetic, or
    • it is either operand of /, %, <, <=, >, or >=.
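To illustrate the two conditions (this sketch is not part of the Rationale): the function names below are hypothetical, and the "unsigned preserving" variants only simulate the old behavior with an explicit cast, assuming a platform where int can represent every unsigned char value.

```c
#include <assert.h>
#include <limits.h>

static_assert(UCHAR_MAX <= INT_MAX,
              "sketch assumes unsigned char promotes to int");

/* Value preserving (standard C since C89): c is promoted to int,
   so c - 2 can be negative. */
int less_than_zero_value_preserving(unsigned char c)
{
    return (c - 2) < 0;
}

/* Simulated unsigned preserving: the cast forces the promotion to
   unsigned int, so the subtraction wraps and is never negative. */
int less_than_zero_unsigned_preserving(unsigned char c)
{
    return ((unsigned int)c - 2u) < 0u;
}

/* Division is one of the operators where signedness matters. */
int halved_value_preserving(unsigned char c)
{
    return (c - 2) / 2;
}

unsigned int halved_unsigned_preserving(unsigned char c)
{
    return ((unsigned int)c - 2u) / 2u;
}
```

With c equal to 1, the value preserving comparison is true while the simulated unsigned preserving one is false; with c equal to 0, the division yields -1 in one case and a very large positive number in the other.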
In such circumstances a genuine ambiguity of interpretation arises. The result must be dubbed questionably signed, since a case can be made for either the signed or unsigned interpretation. Exactly the same ambiguity arises whenever an unsigned int confronts a signed int across an operator, and the signed int has a negative value. Neither scheme does any better, or any worse, in resolving the ambiguity of this confrontation. Suddenly, the negative signed int becomes a very large unsigned int, which may be surprising, or it may be exactly what is desired by a knowledgeable programmer. Of course, all of these ambiguities can be avoided by a judicious use of casts.
One of the important outcomes of exploring this problem is the understanding that high-quality compilers might do well to look for such questionable code and offer (optional) diagnostics, and that conscientious instructors might do well to warn programmers of the problems of implicit type conversions.
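The confrontation described above, and the cast that resolves it, might look like this in practice (a small sketch, not taken from the Rationale; the function names are hypothetical):

```c
#include <assert.h>

/* The usual arithmetic conversions turn s into unsigned int, so a
   negative s becomes a very large value before the comparison. */
int signed_less_than_unsigned(int s, unsigned int u)
{
    return s < u;
}

/* A cast states the intent explicitly; it is only safe when u <= INT_MAX. */
int signed_less_than_unsigned_cast(int s, unsigned int u)
{
    return s < (int)u;
}

/* Self-check of the behavior described in the comments above. */
void check_confrontation(void)
{
    assert(signed_less_than_unsigned(-1, 1u) == 0);      /* -1 became UINT_MAX */
    assert(signed_less_than_unsigned_cast(-1, 1u) == 1); /* compared as ints */
}
```

Compilers such as GCC and Clang can flag the first function with -Wsign-compare, which is the kind of optional diagnostic the Rationale has in mind.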
The unsigned preserving rules greatly increase the number of situations where unsigned int confronts signed int to yield a questionably signed result, whereas the value preserving rules minimize such confrontations. Thus, the value preserving rules were considered to be safer for the novice, or unwary, programmer. After much discussion, the C89 Committee decided in favor of value preserving rules, despite the fact that the UNIX C compilers had evolved in the direction of unsigned preserving.
QUIET CHANGE IN C89
A program that depends upon unsigned preserving arithmetic conversions will behave differently, probably without complaint. This was considered the most serious semantic change made by the C89 Committee to a widespread current practice.
For reference, you can find more details about those conversions updated to C11 in this answer by Lundin.