The code below for testing endianness is expected to have implementation-defined behavior:
int is_little_endian(void) {
    int x = 1;
    char *p = (char *)&x;
    return *p == 1;
}
But is it possible that it has undefined behavior on purposely contrived architectures? For example, could the first byte of the representation of an int with value 1 (or another well-chosen value) be a trap representation for the char type?
As noted in the comments, the type unsigned char would not have this issue, as it cannot have trap representations, but this question specifically concerns the char type.
Per C 2018 6.2.5 15, char behaves as either signed char or unsigned char. Suppose it is signed char. 6.2.6.2 2 discusses signed integer types, including signed char. At the end of that paragraph, it says:
Which of these [sign and magnitude, two’s complement, or ones’ complement] applies is implementation-defined, as is whether the value with sign bit 1 and all value bits zero (for the first two), or with sign bit and all value bits 1 (for ones’ complement), is a trap representation or a normal value.
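As an aside, one can classify which of these three representations an implementation uses by viewing the object representation of (signed char)-1 through unsigned char, which is always safe. This is a sketch, not part of the original answer:

```c
#include <limits.h>

/* Classify the signed-integer representation of signed char by
 * inspecting the bits of -1 through unsigned char:
 *   two's complement:  all bits 1           -> UCHAR_MAX
 *   ones' complement:  all bits 1 but bit 0 -> UCHAR_MAX - 1
 *   sign and magnitude: sign bit and bit 0  -> anything else here
 */
const char *signed_char_representation(void) {
    signed char sc = -1;
    unsigned char bits = *(unsigned char *)&sc;
    if (bits == UCHAR_MAX)
        return "two's complement";
    if (bits == UCHAR_MAX - 1)
        return "ones' complement";
    return "sign and magnitude";
}
```

On any mainstream implementation this reports two's complement; C 2018 still permits all three.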
Thus, this paragraph allows signed char to have a trap representation. However, the paragraph in the standard that says accessing trap representations may have undefined behavior, 6.2.6.1 5, specifically excludes character types:
Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined. Such a representation is called a trap representation.
Thus, although char may have trap representations, there is no reason we should not be able to access one. The question is then what happens if we use the value in an expression. If a char holds a trap representation, it does not represent a value, so attempting to compare it to 1 in *p == 1 does not appear to have defined behavior.
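One way to sidestep the comparison entirely (a sketch of mine, not from the answer above) is to never read through char at all: copy the first byte of the object representation into an unsigned char with memcpy before comparing:

```c
#include <string.h>

/* Endianness test that never reads through char: the first byte of
 * x's object representation is copied into an unsigned char, a type
 * guaranteed to have no trap representations, and compared there. */
int is_little_endian_memcpy(void) {
    int x = 1;
    unsigned char b;
    memcpy(&b, &x, 1);
    return b == 1;
}
```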
The specific value of 1 in an int will not result in a trap representation in char for any normal C implementation, as the 1 will be in the “rightmost” (lowest-valued) bit of some byte of the int, and no normal C implementation puts the sign bit of a char in that position. However, the C standard apparently does not prohibit such an arrangement, so, theoretically, an int with value 1 might be encoded with bits 00000001 in one of its bytes, and those bits might be a trap representation for a char.
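To see where the 1 actually lands on a given implementation, here is a small sketch (the function name is mine) that scans the bytes of an int with value 1 through unsigned char:

```c
#include <stddef.h>

/* Return the index of the byte of (int)1 whose value is exactly 1,
 * scanning the object representation through unsigned char.
 * Returns -1 if no single byte holds 1, which would require an
 * exotic layout or padding bits. */
int byte_index_of_one(void) {
    int x = 1;
    const unsigned char *p = (const unsigned char *)&x;
    for (size_t i = 0; i < sizeof x; i++)
        if (p[i] == 1)
            return (int)i;
    return -1;
}
```

On a little-endian machine this reports index 0; on a big-endian machine, the last byte.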