What does "representable" mean in C11?

According to C11 WG14 draft version N1570:

The header <ctype.h> declares several functions useful for classifying and mapping characters. In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.

Is the following undefined behaviour?

#include <ctype.h>
#include <limits.h>
#include <stdlib.h>

int main(void) {
  char c = CHAR_MIN; /* assume that char is signed, so CHAR_MIN < 0 */
  return isspace(c) ? EXIT_FAILURE : EXIT_SUCCESS;
}

Does the standard allow passing a char to isspace() (char converted to int)? In other words, is a char, after conversion to int, representable as an unsigned char?


Here's how wiktionary defines "representable":

Capable of being represented.

Is char capable of being represented as unsigned char? Yes. §6.2.6.1/4:

Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char [n] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value.

sizeof(char) == 1 therefore its object representation is unsigned char[1] i.e., char is capable of being represented as an unsigned char. Where am I wrong?

Concrete example: I can represent [-2, -1, 0, 1] as [0, 1, 2, 3]. If I can't, then why not?


Related: according to §6.3.1.3, isspace((unsigned char)c) is portable if INT_MAX >= UCHAR_MAX; otherwise it is implementation-defined.

jfs asked Sep 10 '14 23:09


1 Answer

What does representable in a type mean?

Re-formulated, a type is a convention for what the underlying bit-patterns mean. A value is thus representable in a type, if that type assigns some bit-pattern that meaning.

A conversion (which might need a cast) is a mapping from a value represented in one type to a (possibly different) value represented in the target type.
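To make the distinction concrete, here is a minimal sketch of "representable" as a question about the value's range rather than about the bytes that store it. The helper names representable_as_uchar, valid_ctype_arg and check_representable are hypothetical, introduced only for illustration; the range test itself follows from the wording of §7.4/1 quoted in the question.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>   /* EOF */

/* Hypothetical helper: is the value v within the range of unsigned char? */
static bool representable_as_uchar(int v)
{
    return v >= 0 && v <= UCHAR_MAX;
}

/* Hypothetical helper: is v an acceptable <ctype.h> argument (7.4/1)? */
static bool valid_ctype_arg(int v)
{
    return v == EOF || representable_as_uchar(v);
}

/* Self-check of the claims above; aborts if any of them fails. */
static void check_representable(void)
{
    assert(representable_as_uchar(0));
    assert(representable_as_uchar(UCHAR_MAX));
    assert(!representable_as_uchar(-1)); /* no unsigned char bit-pattern means -1 */
    assert(valid_ctype_arg(EOF));        /* EOF is negative but explicitly allowed */
}
```

Copying the byte of a negative char into an unsigned char (as with memcpy) yields some unsigned char value, but it is a different value, so the original negative value is still not representable.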


Under the given assumption (that char is signed), CHAR_MIN is certainly negative, and the text you quoted leaves no room for interpretation:
Yes, it is undefined behavior, as unsigned char cannot represent any negative numbers.

If that assumption did not hold, your program would be well-defined, because CHAR_MIN would be 0, a valid value for unsigned char.

Thus, we have a case where it is implementation-defined whether the program is undefined or well-defined.
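The usual way to sidestep this is to convert the char to unsigned char before the call, which is the form mentioned at the end of the question. The sketch below assumes UCHAR_MAX <= INT_MAX (true on mainstream platforms, and checked at compile time here); is_space_char is a hypothetical wrapper name, not a standard function.

```c
#include <assert.h>   /* static_assert (C11) */
#include <ctype.h>
#include <limits.h>

/* Assumption: every unsigned char value fits in int, so the
   converted argument keeps its value when passed to isspace(). */
static_assert(UCHAR_MAX <= INT_MAX, "unsigned char must fit in int");

/* Hypothetical wrapper: classify a plain char without risking UB.
   Returns 1 for whitespace, 0 otherwise. */
static int is_space_char(char c)
{
    /* The conversion to unsigned char makes the argument non-negative,
       whether plain char is signed or unsigned. */
    return isspace((unsigned char)c) != 0;
}
```

With this wrapper, is_space_char(CHAR_MIN) is well-defined regardless of the signedness of char; which characters count as whitespace still depends on the current locale.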


As an aside, there is no guarantee that sizeof(int) > 1 or that INT_MAX >= UCHAR_MAX, so int might not be able to represent every value of unsigned char.

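A conversion preserves the value whenever the target type can represent it (§6.3.1.3/1), and int can represent every value of signed char, so a signed char can always be converted to int unchanged.
But if the value was negative, it is still negative as an int, and that does not change the impossibility of representing a negative value as an unsigned char. (Converting it to unsigned char is nevertheless defined: per §6.3.1.3/2, conversion from any integer type to any unsigned integer type reduces the value modulo one more than the target type's maximum. C performs such narrowing conversions implicitly, though a cast makes them explicit.)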
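As a small illustration of that last point, converting a negative int to unsigned char is well-defined but yields a different, non-negative value. The helper names to_uchar and check_conversion are hypothetical, and the sketch relies only on the modular rule in §6.3.1.3/2.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the value obtained by converting v to unsigned char.
   Well-defined for every v: the result is v reduced modulo UCHAR_MAX + 1. */
static unsigned char to_uchar(int v)
{
    return (unsigned char)v;
}

/* Self-check of the modular rule; aborts if any claim fails. */
static void check_conversion(void)
{
    assert(to_uchar(-1) == UCHAR_MAX); /* -1 + (UCHAR_MAX + 1) */
    assert(to_uchar(0) == 0);          /* representable values are unchanged */
    assert(to_uchar('A') == 'A');
}
```

So -1 converts to UCHAR_MAX: the conversion succeeds, yet the value -1 itself was never representable as an unsigned char.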

Deduplicator answered Oct 11 '22 17:10