After reading the "32 bit unsigned multiply on 64 bit causing undefined behavior?" question here on StackOverflow, I began to wonder whether ordinary arithmetic operations on small unsigned types could lead to undefined behavior according to the C99 standard.
For example, take the following code:
#include <limits.h>
...
unsigned char x = UCHAR_MAX;
unsigned char y = x + 1;
The x variable is initialized to the maximum magnitude for the unsigned char data type. The next line is the issue: the value x + 1 is greater than UCHAR_MAX and cannot be stored in the unsigned char variable y.
I believe the following is what actually occurs. x is first promoted to data type int (section 6.3.1.1/2), then x + 1 is evaluated as data type int.
Suppose there is an implementation where INT_MAX and UCHAR_MAX are the same: x + 1 would then result in a signed integer overflow. Does this mean that incrementing the variable x, despite its unsigned integer type, can lead to undefined behavior due to a possible signed integer overflow?
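Whatever the answer for such an implementation, the increment can be written so that the question never arises. The following is a minimal sketch, not something taken from the standard's text: the helper name increment_uchar is hypothetical. It relies on the usual arithmetic conversions: adding the unsigned constant 1u forces the addition to be carried out in unsigned int (or a wider unsigned type), where wraparound is well defined, and the cast back to unsigned char reduces the result modulo UCHAR_MAX + 1.

```c
#include <limits.h>

/* Hypothetical helper: increment an unsigned char with guaranteed wraparound.
 * x is promoted first (to int or unsigned int), but because the other operand
 * is 1u the addition itself is performed in an unsigned type, so it can never
 * be a signed overflow. The cast then converts modulo UCHAR_MAX + 1. */
unsigned char increment_uchar(unsigned char x)
{
    return (unsigned char)(x + 1u);
}
```

With this form, incrementing UCHAR_MAX yields 0 on any conforming implementation, regardless of how INT_MAX compares with UCHAR_MAX.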
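Unfortunately, signed integer overflow is undefined behavior. It doesn't matter that overflow of unsigned integer types is well-defined behavior in C and C++: when two unsigned short operands are multiplied on an implementation where int can represent every unsigned short value, both are promoted to int first, so no multiplication of values of type unsigned short ever occurs.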
Unsigned types narrower than int, such as unsigned char and unsigned short, will usually be promoted to type int during operations and comparisons, and so they will be vulnerable to all the undefined behavior of the signed type int. They won't be protected by any well-defined behavior of the original unsigned type, since after promotion the types are no longer unsigned. Sometimes you really do need unsigned integers.
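To make the promotion hazard concrete for the multiplication case, here is a small sketch. The function name mul_ushort is hypothetical, and the overflow described in the comment assumes a common configuration with a 16-bit unsigned short and a 32-bit int; on other implementations the numbers differ.

```c
#include <limits.h>

/* Hypothetical helper: multiply two unsigned short values without ever
 * performing the arithmetic in signed int.
 *
 * Writing plain `a * b` would promote both operands to int wherever int can
 * represent every unsigned short value. Assuming a 16-bit unsigned short and
 * a 32-bit int, 65535 * 65535 exceeds INT_MAX, which is undefined behavior.
 * Converting one operand to unsigned int first makes the whole product
 * unsigned, so it wraps modulo UINT_MAX + 1 instead. */
unsigned int mul_ushort(unsigned short a, unsigned short b)
{
    return (unsigned int)a * b;
}
```

The cast on a single operand is enough: the usual arithmetic conversions then convert the promoted b to unsigned int as well.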
An interesting consequence of the potential for undefined behavior in Figure 4 is that any compiler would be within its rights to generate “optimized” object code for the function (if the static_assert succeeds) that is very fast and almost certainly unintended by the programmer, equivalent to
By my reading of the standard, an implementation which used a 15-bit char could legally store int as a 15-bit magnitude and use a second 15-bit word to store the sign along with 14 bits of padding; in that case, an unsigned char would hold values 0 to 32,767 and an int would hold values from -32,767 to +32,767. Adding 1 to (unsigned char)32767 would indeed be undefined behavior. A similar situation could arise with any larger char size if 32,767 was replaced with UCHAR_MAX.
Such a situation is unlikely, however, compared with the real-world problems associated with unsigned integer multiplication alluded to in the other post.
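If one wanted to know whether a given implementation falls into this corner case, the limits can be compared in the preprocessor. This is only a sketch: the macro UCHAR_INCREMENT_MAY_OVERFLOW and the function uchar_promotes_to_int are hypothetical names introduced for illustration. The dangerous case is exact equality: if UCHAR_MAX were greater than INT_MAX, unsigned char would promote to unsigned int instead and the addition would wrap harmlessly.

```c
#include <limits.h>

/* Hypothetical macro: 1 when unsigned char promotes to int and the largest
 * unsigned char value is already INT_MAX, so that x + 1 could overflow. */
#if UCHAR_MAX == INT_MAX
#define UCHAR_INCREMENT_MAY_OVERFLOW 1
#else
#define UCHAR_INCREMENT_MAY_OVERFLOW 0
#endif

/* Hypothetical helper: nonzero when int can represent every unsigned char
 * value, which is the condition under which the promotion yields int. */
int uchar_promotes_to_int(void)
{
    return UCHAR_MAX <= INT_MAX;
}
```

On mainstream implementations with an 8-bit char the macro is expected to evaluate to 0, which fits the observation that the situation is unlikely in practice.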