Why do putchar, toupper, tolower, etc. take an int instead of a char?

In C, strings are arrays of char (char *) and characters are usually stored in char. I noticed that some libc functions take an int argument instead of a char.

For instance, let's take the functions toupper() and tolower() that both use int. The man page says:

If c is not an unsigned char value, or EOF, the behavior of these functions is undefined.

My guess is that with an int, toupper and tolower can handle both unsigned char values and EOF. But in practice EOF (is there any rule about its value?) is a value that can be stored in a char, and since those functions won't transform EOF into something else, I'm wondering why toupper does not simply take a char as its argument.

In any case why do we need to accept something that is not a character (such as EOF)? Could someone provide me a relevant use case?

The same goes for fputc and putchar, which also take an int that is converted to an unsigned char anyway.

I am looking for the precise motivation behind that choice. I want to be convinced; I don't want to have to answer "I don't know" if someone asks me one day.

asked Jul 03 '13 16:07 by Maxime Chéramy



2 Answers

C11 7.4

The header <ctype.h> declares several functions useful for classifying and mapping characters. In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.

C11 7.21.1

EOF

which expands to an integer constant expression, with type int and a negative value, ...

The C standard explicitly states that EOF is always an int with a negative value. Furthermore, the signedness of plain char is implementation-defined, so char may be unsigned and unable to store a negative value:

C11 6.2.5

If a member of the basic execution character set is stored in a char object, its value is guaranteed to be nonnegative. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.
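As a rough illustration of what these rules mean for calling code, here is a minimal sketch (the function name upcase_in_place is made up for this example). Because plain char may be signed, a byte outside the basic character set can be negative, which is outside the range the <ctype.h> functions accept, so the usual idiom is to cast to unsigned char before the call:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: uppercase a string in place.
   The cast to unsigned char keeps the argument in the range that
   <ctype.h> requires, even where plain char is signed and a byte
   outside the basic character set would otherwise be negative. */
void upcase_in_place(char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        s[i] = (char)toupper((unsigned char)s[i]);
    }
}
```

Without the cast, toupper(s[i]) would be undefined behavior for any negative char value other than one that happens to equal EOF.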

answered Oct 12 '22 23:10 by Lundin


Back in the day, a common coding idiom was:

/* example */
#include <ctype.h>
#include <stdio.h>

int GetDecimal(void) {
  int sum = 0;
  int ch;
  while (isdigit(ch = getchar())) { /* isdigit(EOF) returns 0 */
    sum *= 10;
    sum += ch - '0';
  }
  ungetc(ch, stdin);  /* If ch is EOF, the operation fails and the input stream is unchanged. */
  return sum;
}

A ch holding the value EOF could then be passed to various functions such as isalpha() and tolower().

This style caused problems with putchar(EOF), which I suspect behaved the same as putchar(255).

That idiom is discouraged today for various reasons. Patterns like the following, which test for EOF explicitly, are preferred.

int GetDecimal(void) {
  int ch;
  while ((ch = getchar()) != EOF && isdigit(ch)) {
    ...
  }
  ...
}
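For readers who want something they can compile, here is one possible way to flesh out that pattern. This is only a sketch under assumptions: the name read_decimal and the FILE * parameter are not from the code above, and overflow of sum is not handled.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical variant of GetDecimal that reads from any stream.
   EOF is tested explicitly before the character reaches isdigit().
   Note: no overflow check on sum; this is an illustration only. */
int read_decimal(FILE *in)
{
    int sum = 0;
    int ch;
    while ((ch = getc(in)) != EOF && isdigit(ch)) {
        sum = sum * 10 + (ch - '0');
    }
    if (ch != EOF) {
        ungetc(ch, in);  /* push back the first non-digit character */
    }
    return sum;
}
```

Calling read_decimal(stdin) would behave like the original GetDecimal on well-formed input, while ch stays an int throughout so that EOF remains distinguishable from every valid character value.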
answered Oct 12 '22 23:10 by chux - Reinstate Monica