
Using int for character types when comparing with EOF

Tags:

c

Quoting from Kernighan and Ritchie's 'The C Programming Language' Page 16 -

#include <stdio.h>

int main(void)
{
    int c;

    c = getchar();
    while (c != EOF) {
        putchar(c);
        c = getchar();
    }
    return 0;
}

"The type char is specifically meant for storing such character data, but any integer type can be used. We used int for a subtle but important reason. The problem is distinguishing the end of the input from valid data. The solution is that getchar returns a distinctive value when there is no more input, a value that cannot be confused with any real character. This value is called EOF, for "end of file". We must declare c to be a type big enough to hold any value that getchar returns. We can't use char since c must be big enough to hold EOF in addition to any possible char. Therefore we use int.".

I looked it up in stdio.h; it says #define EOF (-1)

The book conclusively states that char cannot be used, yet this program "works just fine" (see EDIT) with c declared as char as well. What is going on? Can anyone explain in terms of bits and signed values?

EDIT:
As Oli mentioned in his answer, the program cannot distinguish between EOF and 255, so it does not actually work fine. I want to know what's happening: when we do the comparison c != EOF, does the EOF value get converted to the char value 255 (11111111 in binary, i.e. bits 0 through 7 of EOF written in two's complement notation)?

Asked Dec 11 '11 by Vikesh




2 Answers

getchar's result is the input character converted to unsigned char and then to int, or EOF; i.e. it will be in the range -1 to 255, which is 257 different values. You can't fit those into an 8-bit char without merging two of them. In practice, either you'll mistake EOF for a valid character (that will happen if char is unsigned), or you'll mistake another character for EOF (that will happen if char is signed).

Note: I'm assuming an 8-bit char type. That assumption isn't backed by the standard; it is just by far the most common implementation choice.

Answered Sep 23 '22 by AProgrammer


Your program doesn't work fine; it won't be able to distinguish between EOF and 255.

The reason it appears to work correctly is because char is probably signed on your platform, so it's still capable of representing -1.

Answered Sep 24 '22 by Oliver Charlesworth