
Is it possible to confuse EOF with a normal byte value when using fgetc?

We often use fgetc like this:

int c;
while ((c = fgetc(file)) != EOF)
{
    // do stuff
}

Theoretically, if a byte in the file has the value of EOF, this code is buggy - it will break the loop early and fail to process the whole file. Is this situation possible?

As far as I understand, fgetc internally casts a byte read from the file to unsigned char and then to int, and returns it. This will work if the range of int is greater than that of unsigned char.

What happens if it isn't (presumably that would mean sizeof(int) == 1)?

  • Will fgetc sometimes return legitimate data equal to EOF?
  • Will it alter the data it read from the file to avoid the single value EOF?
  • Will fgetc be an unimplemented function?
  • Will EOF be of another type, like long?
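The conversion described above (byte to unsigned char, then to int) can be checked with a small sketch. The helper name read_back_ff_byte is hypothetical, and the example assumes a typical platform where int is wider than unsigned char and CHAR_BIT is 8. It writes the byte 0xFF to a temporary file and returns whatever fgetc reports for it.

```c
#include <stdio.h>

/* Hypothetical helper (not part of any library): writes the single byte
   0xFF to a temporary file, reads it back with fgetc, and returns the
   value fgetc produced. Returns EOF if the temporary file cannot be made. */
int read_back_ff_byte(void)
{
    FILE *f = tmpfile();       /* opened in binary update mode ("wb+") */
    if (f == NULL)
        return EOF;

    fputc(0xFF, f);            /* stored as (unsigned char)0xFF */
    rewind(f);
    int c = fgetc(f);          /* the byte as unsigned char, converted to int */
    fclose(f);
    return c;
}
```

Under those assumptions the returned value is non-negative, so it cannot compare equal to EOF, which is a negative int.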

I could make my code fool-proof by an extra check:

int c;
for (;;)
{
    c = fgetc(file);
    if (feof(file))
        break;
    // do stuff
}

Is it necessary if I want maximum portability?

asked Sep 17 '15 23:09 by anatolyg


1 Answer

Yes, calling c = fgetc(file) and then testing feof(file) does work for maximum portability. It works on ordinary platforms and also when unsigned char and int have the same number of distinct values. That occurs only on rare platforms where char, signed char, unsigned char, short, unsigned short, int, and unsigned all have the same bit width and the same size of range.
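Whether a given platform is one of those rare cases can be tested from <limits.h>. The following is a minimal sketch; the function name eof_is_unambiguous is an assumption made for illustration. If every unsigned char value fits in a non-negative int, fgetc can never return a data byte that compares equal to the negative EOF.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 when every unsigned char value is
   representable as a non-negative int, so no data byte returned by
   fgetc can be mistaken for EOF. Returns 0 on the rare platforms
   where unsigned char and int have the same number of values. */
int eof_is_unambiguous(void)
{
    return UCHAR_MAX <= INT_MAX;
}
```

On mainstream hosted systems (8-bit char, int of 16 bits or more) this returns 1, and the plain (c = fgetc(file)) != EOF loop is safe as far as data values are concerned.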

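Note that feof(file) alone is insufficient. Code should also check ferror(file): after a read error fgetc returns EOF while the end-of-file indicator stays clear, so a loop that tests only feof would treat that EOF as data and might never exit.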

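int c;
for (;;)
{
    c = fgetc(file);
    if (c == EOF) {
        // EOF may mean end-of-file, a read error, or (on rare platforms) a data byte
        if (feof(file)) break;    // genuine end of file
        if (ferror(file)) break;  // read error
        // neither indicator is set: c is legitimate data that equals EOF
    }
    // do stuff
}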
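As a usage sketch, the same loop can be wrapped in a function. The name count_bytes and its return convention (-1 on a read error) are assumptions for illustration, not part of the answer above.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in `file`.
   Returns the count, or -1 if a read error occurred. */
long count_bytes(FILE *file)
{
    long n = 0;
    for (;;) {
        int c = fgetc(file);
        if (c == EOF) {
            if (ferror(file))
                return -1;     /* read error */
            if (feof(file))
                break;         /* genuine end of file */
            /* Neither indicator is set: on an exotic platform this is
               a data byte whose value equals EOF, so count it. */
        }
        n++;
    }
    return n;
}
```

Reporting the error separately from end-of-file lets the caller tell a truncated read from a complete one, which a bare break on either condition does not.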
answered Oct 02 '22 04:10 by chux - Reinstate Monica