The standard is clear: when performing arithmetic on an integral type smaller than int, the integer is first promoted to a signed int, unless int cannot represent the full range of values for the original type, in which case the promotion is to unsigned int instead.
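For example, a minimal sketch of the rule in action, assuming a typical platform where int is 32 bits and can represent every unsigned char value:

```c
#include <stdio.h>

int main(void) {
    unsigned char c = 200;
    /* c is promoted before the arithmetic: on a platform where int
       can represent every unsigned char value, c + c has type int,
       so the result is 400 rather than wrapping at 256. */
    printf("%zu\n", sizeof(c + c)); /* typically sizeof(int), e.g. 4 */
    printf("%d\n", c + c);          /* 400 */
    return 0;
}
```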
My question is: what is (was?) the motivation for this policy? Why are unsigned types promoted to signed int, rather than always to unsigned int?
Of course, in practice there's almost no difference, since the underlying assembly instruction is the same (just a zero-extension). But there is a key downside to promotion to signed int, with no obvious upside: overflow is undefined behavior in signed arithmetic but well-defined in unsigned arithmetic.
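A minimal sketch of that downside, assuming 16-bit unsigned short and 32-bit int, so both operands promote to signed int:

```c
int main(void) {
    unsigned short a = 0xFFFF, b = 0xFFFF;
    /* Both operands promote to signed int, so the multiplication is
       performed in int: 65535 * 65535 == 4294836225, which exceeds
       INT_MAX. That overflow is undefined behavior, not the modular
       wraparound the unsigned operand types might suggest. */
    unsigned r = a * b;
    (void)r;
    return 0;
}
```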
Were there historical reasons for preferring signed int? Are there architectures that don't use two's complement arithmetic on which promoting small unsigned types to signed int rather than unsigned int is easier or faster?
EDIT: I would think it's obvious, but here I'm looking for facts (i.e. some documentation or references that explain the design decision), not "primarily opinion-based" speculation.
This is addressed in the ANSI C Rationale (the link is to the relevant section, 3.2.1.1). It was, to some extent, an arbitrary choice that could have gone either way, but there are reasons for the choice that was made.
Since the publication of K&R, a serious divergence has occurred among implementations of C in the evolution of integral promotion rules. Implementations fall into two major camps, which may be characterized as unsigned preserving and value preserving. The difference between these approaches centers on the treatment of unsigned char and unsigned short, when widened by the integral promotions, but the decision has an impact on the typing of constants as well (see §3.1.3.2).

The unsigned preserving approach calls for promoting the two smaller unsigned types to unsigned int. This is a simple rule, and yields a type which is independent of execution environment.

The value preserving approach calls for promoting those types to signed int, if that type can properly represent all the values of the original type, and otherwise for promoting those types to unsigned int. Thus, if the execution environment represents short as something smaller than int, unsigned short becomes int; otherwise it becomes unsigned int.
[SNIP]
The unsigned preserving rules greatly increase the number of situations where unsigned int confronts signed int to yield a questionably signed result, whereas the value preserving rules minimize such confrontations. Thus, the value preserving rules were considered to be safer for the novice, or unwary, programmer. After much discussion, the Committee decided in favor of value preserving rules, despite the fact that the UNIX C compilers had evolved in the direction of unsigned preserving.
(I recommend reading the full section. I just didn't want to quote the whole thing here.)
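To make the difference between the two camps concrete, here is a minimal sketch of mine (not from the Rationale), assuming 16-bit short and 32-bit int:

```c
#include <stdio.h>

int main(void) {
    unsigned short us = 1;
    int si = -1;

    /* Value preserving (standard C): us promotes to int, so this
       compares -1 < 1 and prints 1.
       Unsigned preserving would promote us to unsigned int, which
       would drag si to unsigned int as well: UINT_MAX < 1 is false,
       so the same comparison would print 0. */
    printf("%d\n", si < us);
    return 0;
}
```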
An interesting portion of the Rationale snipped from Keith Thompson's answer:
Both schemes give the same answer in the vast majority of cases, and both give the same effective result in even more cases in implementations with twos-complement arithmetic and quiet wraparound on signed overflow --- that is, in most current implementations. In such implementations, differences between the two only appear when these two conditions are both true:
An expression involving an unsigned char or unsigned short produces an int-wide result in which the sign bit is set: i.e., either a unary operation on such a type, or a binary operation in which the other operand is an int or "narrower" type.
The result of the preceding expression is used in a context in which its signedness is significant:
- sizeof(int) < sizeof(long) and it is in a context where it must be widened to a long type, or
- it is the left operand of the right-shift operator (in an implementation where this shift is defined as arithmetic), or
- it is either operand of /, %, <, <=, >, or >=.
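To illustrate, here is a sketch of mine combining the unary-operation case above with the right-shift context, assuming 16-bit unsigned short, 32-bit int, and an implementation whose right shift of negative values is arithmetic:

```c
#include <stdio.h>

int main(void) {
    unsigned short x = 1;

    /* Unary minus on the promoted x yields an int-wide result with
       the sign bit set. Under the value preserving rules, -x is the
       int value -1, and an arithmetic right shift keeps it at -1.
       Under unsigned preserving rules, -x would be the unsigned int
       value UINT_MAX, and the shift would yield 0x0FFFFFFF. */
    printf("%ld\n", (long)((-x) >> 4));  /* -1 on such a platform */
    return 0;
}
```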
Note that the Standard imposes no requirements on how an implementation processes any situation where quiet-wraparound behavior would be relevant. The clear implication is that the authors of the Standard expected that commonplace implementations for two's-complement platforms would behave as described above with or without a mandate, absent a compelling reason to do otherwise, and thus there was no need to mandate that they do so. While it would seem unlikely that they considered the possibility that a 32-bit implementation given something like:
unsigned mul(unsigned short x, unsigned short y) { return x*y; }
might aggressively exploit the fact that it wasn't required to accommodate values of x greater than 2147483647/y, some compilers for modern platforms treat the lack of a requirement as an invitation to generate code that will malfunction in those cases.
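A common defensive rewrite (my sketch, not something the Standard mandates) is to force the multiplication into unsigned int, where wraparound is well defined:

```c
unsigned mul(unsigned short x, unsigned short y)
{
    /* Casting one operand to unsigned makes the usual arithmetic
       conversions carry out the multiplication in unsigned int,
       which wraps modulo UINT_MAX + 1 instead of overflowing a
       signed int, regardless of the promotion rules. */
    return (unsigned)x * y;
}
```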