Quoting from Kernighan and Ritchie's 'The C Programming Language', page 16:

#include <stdio.h>

main()
{
    int c;

    c = getchar();
    while (c != EOF) {
        putchar(c);
        c = getchar();
    }
    return 0;
}
"The type
char
is specifically meant for storing such character data, but any integer type can be used. We usedint
for a subtle but important reason. The problem is distinguishing the end of the input from valid data. The solution is thatgetchar
returns a distinctive value when there is no more input, a value that cannot be confused with any real character. This value is calledEOF
, for "end of file". We must declarec
to be a type big enough to hold any value thatgetchar
returns. We can't usechar
sincec
must be big enough to holdEOF
in addition to any possiblechar
. Therefore we useint
.".
I looked it up in stdio.h; it says #define EOF (-1)
The book conclusively states that char cannot be used, whereas this program "works just fine" (see EDIT) with c declared as char as well. What is going on? Can anyone explain in terms of bits and signed values?
EDIT:
As Oli mentioned in the answer, the program cannot distinguish between EOF and 255, so it will not work fine. I want to know what's happening: when we do the comparison c != EOF, does the EOF value get cast to a char value of 255 (11111111 in binary, i.e. bits 0 through 7 of EOF written in two's complement notation)?
The analyzer detected that the EOF constant is compared with a variable of type 'char' or 'unsigned char'. Such a comparison implies that some characters won't be processed correctly, because EOF is really just the value '-1' of type 'int'.
EOF is of type int. Since it's negative, it's by definition a value that an unsigned char can't hold. But since it's possible for unsigned char and int to have the same size (if CHAR_BIT >= 16), it may not be possible to distinguish between EOF and a valid result from getchar().
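To see the sizes involved on a given platform, a quick probe like the following sketch (not part of the original answer) prints the relevant limits; when CHAR_BIT is 8, the overwhelmingly common case, a char simply has too few distinct values to keep EOF separate from the 256 possible bytes:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("CHAR_BIT  = %d\n", CHAR_BIT);            /* bits in a char, almost always 8 */
    printf("UCHAR_MAX = %u\n", (unsigned)UCHAR_MAX); /* largest value a byte can hold   */
    printf("INT_MAX   = %d\n", INT_MAX);
    printf("EOF       = %d\n", EOF);                 /* negative, outside 0..UCHAR_MAX  */
    return 0;
}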
In computing, end-of-file (EOF) is a condition in a computer operating system where no more data can be read from a data source. The data source is usually called a file or stream.
On Linux you can signal EOF from the keyboard with control-d; that is, you hold down the control key and hit d. This sends the EOT control character, whose ASCII value is 0x04, but that byte never reaches your program: the terminal driver interprets it as end of input, and getchar() then reports the EOF condition. Similarly, a text file with its text and whitespace (blanks, tabs, newline characters) does not end with a stored EOF character; the end-of-file condition simply means there is no more data to read.
getchar's result is the input character converted to unsigned char and then to int, or EOF; i.e. it will be in the -1 to 255 range. That's 257 different values; you can't put them into an 8-bit char without merging two of them. Practically, either you'll mistake EOF for a valid character (that will happen if char is unsigned) or you'll mistake another character for EOF (that will happen if char is signed).

Note: I'm assuming an 8-bit char type. I know this assumption isn't backed by the standard; it is just by far the most common implementation choice.
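A short sketch of the two failure modes just described, assuming an 8-bit char and EOF defined as -1 (the helper names are made up for illustration; neither loop should be used in real code):

#include <stdio.h>

/* Signed char: the byte 0xFF typically becomes -1 and is mistaken for EOF,
   so the copy stops early on binary or Latin-1 input. */
static void copy_with_signed_char(void)
{
    signed char c = getchar();
    while (c != EOF) {
        putchar(c);
        c = getchar();
    }
}

/* Unsigned char: EOF (-1) becomes 255, so the test is always true
   and the loop never terminates once input runs out. */
static void copy_with_unsigned_char(void)
{
    unsigned char c = getchar();
    while (c != EOF) {
        putchar(c);
        c = getchar();
    }
}

int main(void)
{
    copy_with_signed_char();   /* try copy_with_unsigned_char() to see the other bug */
    return 0;
}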
Your program doesn't work fine; it won't be able to distinguish between EOF and 255.

The reason it appears to work correctly is because char is probably signed on your platform, so it's still capable of representing -1.