According to C11 WG14 draft version N1570:
The header <ctype.h> declares several functions useful for classifying and mapping characters. In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
Is the following program undefined behaviour?
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>

int main(void) {
    char c = CHAR_MIN; /* let's assume that char is signed and CHAR_MIN < 0 */
    return isspace(c) ? EXIT_FAILURE : EXIT_SUCCESS;
}
Does the standard allow passing a char to isspace() (char to int)? In other words, is a char, after conversion to int, representable as an unsigned char?
Here's how Wiktionary defines "representable":
Capable of being represented.
Is char capable of being represented as unsigned char? Yes. §6.2.6.1/4:
Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char [n] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value.
sizeof(char) == 1, therefore its object representation is unsigned char[1], i.e., char is capable of being represented as an unsigned char. Where am I wrong?
As a concrete example, I can represent [-2, -1, 0, 1] as [0, 1, 2, 3]. If I can't, then why not?
Related: According to §6.3.1.3, isspace((unsigned char)c) is portable if INT_MAX >= UCHAR_MAX; otherwise it is implementation-defined.
What does representable in a type mean?
Put differently, a type is a convention for what the underlying bit-patterns mean. A value is thus representable in a type if that type assigns that meaning to some bit-pattern.
A conversion (which might need a cast) is a mapping from a value represented in one type to a (possibly different) value represented in the target type.
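To make the distinction concrete, here is a minimal sketch contrasting a value conversion with a copy of the object representation. The helper names convert_value and copy_representation are made up for this illustration. In both cases the result is some non-negative unsigned char value; the original negative value itself is never "represented" in unsigned char.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical helper: value conversion. A negative input is reduced
   modulo UCHAR_MAX + 1, so the result is a different, non-negative value. */
static unsigned char convert_value(signed char v)
{
    return (unsigned char)v;
}

/* Hypothetical helper: copies the single byte of the object representation
   unchanged, as described in 6.2.6.1/4. Which unsigned value that byte
   denotes for a negative input depends on the signed representation. */
static unsigned char copy_representation(signed char v)
{
    unsigned char byte;
    memcpy(&byte, &v, sizeof byte);
    return byte;
}
```

For example, convert_value(-1) yields UCHAR_MAX, not -1. That is the mapping of [-2, -1, 0, 1] onto other numbers from the question: it is a conversion to a different value, which is not the same thing as the value being representable.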
Under the given assumption (that char is signed), CHAR_MIN is certainly negative, and the text you quoted leaves no room for interpretation:
Yes, it is undefined behavior, as unsigned char cannot represent any negative numbers.
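The precondition from the quoted paragraph can be written down as a small predicate. This is only a sketch of the rule; is_valid_ctype_arg is a hypothetical name, not something <ctype.h> provides.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if x may legally be passed to the
   <ctype.h> functions, i.e. x is EOF or lies in 0..UCHAR_MAX. */
static int is_valid_ctype_arg(int x)
{
    return x == EOF || (x >= 0 && x <= UCHAR_MAX);
}
```

With a signed char, CHAR_MIN converted to int stays negative, so (unless it happens to equal EOF) this check fails for it, which is exactly the undefined case.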
If that assumption did not hold, your program would be well-defined, because CHAR_MIN would be 0, a valid value for unsigned char.
Thus, we have a case where it is implementation-defined whether the program is undefined or well-defined.
As an aside, there is no guarantee that sizeof(int) > 1 or INT_MAX >= CHAR_MAX, so int might not be able to represent all values possible for unsigned char.
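Whether a given implementation is affected can be checked from the <limits.h> macros. A minimal sketch, with int_holds_every_uchar being a made-up name:

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if every unsigned char value fits in an
   int on this implementation, 0 otherwise. */
static int int_holds_every_uchar(void)
{
    return INT_MAX >= UCHAR_MAX;
}
```

On mainstream platforms (8-bit char, int of at least 16 bits) this returns 1. In C11 the same condition could instead be enforced at compile time with _Static_assert(INT_MAX >= UCHAR_MAX, "int cannot hold every unsigned char").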
As conversions are defined to be value-preserving, a signed char can always be converted to int.
But if the value is negative, that does not change the impossibility of representing a negative value as an unsigned char. (The conversion itself is defined, as conversion from any integral type to any unsigned integral type is always defined, though narrowing conversions need a cast.)
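In practice, the usual way to stay within the rules is to convert to unsigned char before the call, as in the isspace((unsigned char)c) form mentioned in the question. A minimal sketch, assuming INT_MAX >= UCHAR_MAX; the wrapper name is_space_char is hypothetical:

```c
#include <ctype.h>

/* Hypothetical wrapper: classifies a plain char whether char is signed
   or unsigned on this implementation. */
static int is_space_char(char c)
{
    /* The cast converts c to unsigned char first, so the int that
       isspace() receives lies in 0..UCHAR_MAX
       (assumption: INT_MAX >= UCHAR_MAX). */
    return isspace((unsigned char)c) != 0;
}
```

With this wrapper, is_space_char(CHAR_MIN) is well-defined even when char is signed, because the negative value is converted to a non-negative one before it reaches isspace().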