
Comparing unsigned char and EOF

When the following code is compiled and run, it goes into an infinite loop:

#include <stdio.h>
#include <stdlib.h>
int main()
{
    unsigned char  ch;
    FILE *fp;
    fp = fopen("abc","r");
    if(fp==NULL)
    {
        printf("Unable to Open");
        exit(1);
    }
    while((ch = fgetc(fp))!=EOF)   /* line 13: the comparison gcc warns about */
        printf("%c",ch);
    fclose(fp);
    printf("\n");
    return 0;
}

gcc also gives a warning on compilation:

abc.c:13:warning: comparison is always true due to limited range of data type

The code runs fine, i.e. it terminates as expected, when unsigned char is replaced by char or int.
But it also runs fine for unsigned int. I have read that EOF is defined as -1 in stdio.h, so why does this code fail for unsigned char but run fine for unsigned int?

Amol Sharma asked Dec 21 '11 08:12




2 Answers

The golden rule for writing this line is

   while ((ch = fgetc(stdin)) != EOF)

ch should be int. Your cute trick of making ch unsigned fails because EOF is a signed int quantity.
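A minimal sketch of the rule above, assuming a helper named count_bytes and a self-check named count_bytes_demo (both names are hypothetical and not from the question). It reads a stream with an int ch, so that EOF stays distinct from every byte value, including 0xFF:

```c
#include <stdio.h>

/* Counts the bytes in a stream. ch is an int so that the negative EOF
   stays distinct from every byte value 0..UCHAR_MAX. */
long count_bytes(FILE *fp)
{
    int ch;          /* int, not char or unsigned char */
    long n = 0;

    while ((ch = fgetc(fp)) != EOF)
        n++;
    return n;
}

/* Hypothetical self-check: writes three bytes, including 0xFF, to a
   temporary file and counts them. Returns -1 if no temporary file
   could be created. */
long count_bytes_demo(void)
{
    FILE *fp = tmpfile();
    long n;

    if (fp == NULL)
        return -1;
    fputc('a', fp);
    fputc(0xFF, fp);
    fputc('b', fp);
    rewind(fp);
    n = count_bytes(fp);
    fclose(fp);
    return n;
}
```

With an unsigned char ch this loop would never end. With a plain char ch, on a platform where char is signed and EOF is -1, it would likely stop early at the 0xFF byte, because that byte would compare equal to EOF.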

OK, let's now go into the details.

Step 1:

ch=fgetc(fp)

At end of file, fgetc() returns EOF, which is -1 (a signed int) here. By the golden rules of C, assigning it to an unsigned char keeps only the lowest octet, which is all 1s, hence the value 255. The bit pattern of ch after the execution of

ch = fgetc(fp); 

would thus be

11111111

Step 2:

ch != EOF

Now EOF is a signed int and ch is an unsigned char.

Again I refer to the golden rule of C: the smaller type (ch) is promoted to int before the comparison, so its bit pattern (assuming a 32-bit int) is now

00000000000000000000000011111111 = 255 (decimal)

while EOF is

11111111111111111111111111111111 = -1 (decimal)

There is no way they can be equal. Hence the condition that controls the while loop

while ((ch = fgetc(stdin)) != EOF)

will never evaluate to false, and hence the infinite loop.
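The two steps above can be checked with a small sketch. It assumes EOF is -1 and that a char has 8 bits, as on the asker's system. The function names are hypothetical.

```c
#include <limits.h>
#include <stdio.h>

/* Step 1: storing EOF in an unsigned char keeps only the low-order byte. */
unsigned char narrow_eof(void)
{
    return (unsigned char)EOF;   /* 255 when EOF is -1 and CHAR_BIT is 8 */
}

/* Step 2: in the comparison the unsigned char is widened and keeps its
   non-negative value, so it cannot match the negative EOF. */
int narrowed_equals_eof(void)
{
    unsigned char ch = narrow_eof();
    return ch == EOF;            /* always 0 */
}
```

gcc may repeat the "limited range of data type" warning for the comparison in narrowed_equals_eof, which is the same diagnosis as in the question.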

bashrc answered Sep 30 '22 13:09


There are several implicit conversions going on. They aren't really relevant to the specific warning, but I included them in this answer to show what the compiler really does with that expression.

  • ch in your example is of type unsigned char.
  • EOF is guaranteed to be of type int (C99 7.19.1).

So the expression is equivalent to

(unsigned char)ch != (int)EOF

The integer promotion rules in C will implicitly convert the unsigned char to int, because on typical implementations int can represent every value of unsigned char:

(int)ch != (int)EOF

Both operands now have the same type, so the balancing rules (aka the usual arithmetic conversions) do not change anything further.

On your compiler EOF is likely -1:

(int)ch != -1

After the promotion, ch can only hold a value from 0 to 255 (assuming an 8-bit char). A character can never be equal to -1, hence the warning.
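For contrast, here is a hedged sketch of why the unsigned int variant from the question terminates. It assumes EOF is -1, and the function names are hypothetical. The stored value and EOF both end up as the same unsigned int value, so the comparison can become false:

```c
#include <limits.h>
#include <stdio.h>

/* What an unsigned int ch holds after `ch = fgetc(fp)` at end of file. */
unsigned int stored_eof(void)
{
    unsigned int ch = (unsigned int)EOF;  /* UINT_MAX when EOF is -1 */
    return ch;
}

/* The loop condition `ch != EOF` for that stored value. */
int loop_continues(void)
{
    unsigned int ch = stored_eof();
    return ch != EOF;   /* EOF is converted to unsigned int too, so this is 0 */
}
```

Ordinary bytes are stored as 0 to 255 and never collide with UINT_MAX, so the loop ends only at end of file. That makes unsigned int work here, although int remains the type that matches the return type of fgetc().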

Lundin answered Sep 30 '22 12:09