Reading a file line by line in C using fgetc()

Tags: c, file-io

This is how I've done it but I'm not sure this is the preferred idiom:

FILE *fp = fopen(argv[1], "r"); // argv[1] is the first argument; argv[0] is the program name
// handle fopen() returning NULL

while (!feof(fp)) {
    char buffer[80]; // statically allocated, may replace this later with some more sophisticated approach
    int num_chars = 0;

    for (int ch = fgetc(fp); ch != EOF && ch != '\n'; ch = fgetc(fp)) {
        buffer[num_chars++] = ch;
    }

    // null-terminate the string
    buffer[num_chars] = '\0';

    printf("%s\n", buffer);
}

Is this okay, any suggestions to improve this?

helpermethod asked Nov 27 '10 19:11


2 Answers

If you are not going to use fgets() (perhaps because you want to remove the newline, or you want to deal with "\r", "\n" or "\r\n" line endings, or you want to know how many characters were read), you can use this as a skeleton function:

int get_line(FILE *fp, char *buffer, size_t buflen)
{
    char *end = buffer + buflen - 1; /* Allow space for null terminator */
    char *dst = buffer;
    int c;
    while ((c = getc(fp)) != EOF && c != '\n')
    {
        if (dst == end)
        {
            ungetc(c, fp); /* Buffer full: push the character back for the next call */
            break;
        }
        *dst++ = c;
    }
    *dst = '\0';
    return((c == EOF && dst == buffer) ? EOF : dst - buffer);
}

It recognizes only newline as the end of line, and it drops the newline. It does not overflow the buffer, and it does not discard excess characters: when the buffer is full, the character that did not fit is pushed back with ungetc(), so a very long line is read in chunks over successive calls. It returns the number of characters read, or EOF when there is nothing left to read. If you need to distinguish between overflow and a line that happens to be the length of the buffer - 1, then you probably need to preserve the newline, with consequential changes in the code:

int get_line(FILE *fp, char *buffer, size_t buflen)
{
    char *end = buffer + buflen - 1; /* Allow space for null terminator */
    char *dst = buffer;
    int c = 0;
    while (dst < end && (c = getc(fp)) != EOF) /* Check for space before reading, so no character is lost */
    {
        if ((*dst++ = c) == '\n')
            break;
    }
    *dst = '\0';
    return((c == EOF && dst == buffer) ? EOF : dst - buffer);
}
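
To see how the preserved newline helps, here is a minimal sketch of the check a caller might make. The helper name line_is_complete() is an assumption made for illustration, not part of the answer. get_line() is repeated in the block so that it compiles on its own, with the bounds check placed before the read so that no character is consumed once the buffer is full.

```c
#include <stdio.h>

/* Newline-preserving reader, repeated here so the sketch is self-contained. */
int get_line(FILE *fp, char *buffer, size_t buflen)
{
    char *end = buffer + buflen - 1; /* Allow space for null terminator */
    char *dst = buffer;
    int c = 0;
    while (dst < end && (c = getc(fp)) != EOF)
    {
        if ((*dst++ = c) == '\n')
            break;
    }
    *dst = '\0';
    return((c == EOF && dst == buffer) ? EOF : (int)(dst - buffer));
}

/* Hypothetical helper: a chunk is a complete line if it ends in '\n'.
   A chunk without one was cut short by the buffer, or is a final line
   that has no trailing newline. */
int line_is_complete(const char *buffer, int len)
{
    return len > 0 && buffer[len - 1] == '\n';
}
```

A caller can keep calling get_line() and joining chunks until line_is_complete() reports true or get_line() returns EOF.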

There are endless minor variants on this, such as discarding any excess characters if the line has to be truncated. If you want to handle DOS ("\r\n"), old Mac ("\r") or Unix ("\n") line endings, then take a leaf out of the CSV code from "The Practice of Programming" by Kernighan & Pike (an excellent book) and use:

static int endofline(FILE *ifp, int c)
{
    int eol = (c == '\r' || c == '\n');
    if (c == '\r')
    {
        c = getc(ifp);
        if (c != '\n' && c != EOF)
            ungetc(c, ifp);
    }
    return(eol);
}

Then you can use that in place of the c != '\n' test:

int get_line(FILE *fp, char *buffer, size_t buflen)
{
    char *end = buffer + buflen - 1; /* Allow space for null terminator */
    char *dst = buffer;
    int c;
    while ((c = getc(fp)) != EOF && !endofline(fp, c))
    {
        if (dst == end)
        {
            ungetc(c, fp); /* Buffer full: push the character back for the next call */
            break;
        }
        *dst++ = c;
    }
    *dst = '\0';
    return((c == EOF && dst == buffer) ? EOF : dst - buffer);
}
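
As an illustration of one of the minor variants mentioned above, here is a sketch that truncates an over-long line and throws the rest away, so each call consumes exactly one line. The name get_line_truncate() and the was_cut out-parameter are assumptions made for this example, and for brevity it recognizes only '\n' as the line ending.

```c
#include <stdio.h>

/* Sketch: read one line, keep at most buflen - 1 characters, and discard
   whatever does not fit. Sets *was_cut to 1 if characters were dropped.
   Returns the number of characters stored, or EOF at end of input.
   Assumes buflen >= 2. */
int get_line_truncate(FILE *fp, char *buffer, size_t buflen, int *was_cut)
{
    char *end = buffer + buflen - 1; /* Allow space for null terminator */
    char *dst = buffer;
    int c;
    *was_cut = 0;
    while ((c = getc(fp)) != EOF && c != '\n')
    {
        if (dst < end)
            *dst++ = c;
        else
            *was_cut = 1; /* No room left: drop the character */
    }
    *dst = '\0';
    return((c == EOF && dst == buffer) ? EOF : (int)(dst - buffer));
}
```

The trade-off is that the dropped characters are gone for good; the flag only tells the caller that the line was cut short.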

If all you want is to copy the file's contents to the output, an alternative is to skip line handling altogether and use fread() and fwrite():

void copy_file(FILE *in, FILE *out)
{
    char buffer[4096];
    size_t nbytes;
    while ((nbytes = fread(buffer, sizeof(char), sizeof(buffer), in)) != 0)
    {
        if (fwrite(buffer, sizeof(char), nbytes, out) != nbytes)
            /* err_error() is not standard C; substitute your own error reporter */
            err_error("Failed to write %zu bytes\n", nbytes);
    }
}

In context, you'd open the file and check it for validity, then call:

copy_file(fp, stdout);
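
Putting that in context, a minimal sketch might look like the following. The wrapper name print_file() is hypothetical, and because err_error() is not a standard function, this version of copy_file() reports failures through its return value and perror() instead of the void signature used above.

```c
#include <stdio.h>

/* Copy everything from in to out; returns 0 on success, -1 on error. */
int copy_file(FILE *in, FILE *out)
{
    char buffer[4096];
    size_t nbytes;
    while ((nbytes = fread(buffer, sizeof(char), sizeof(buffer), in)) != 0)
    {
        if (fwrite(buffer, sizeof(char), nbytes, out) != nbytes)
            return -1;
    }
    return ferror(in) ? -1 : 0; /* fread() returns 0 on error as well as at EOF */
}

/* Hypothetical wrapper: open path, copy it to out, and close it again. */
int print_file(const char *path, FILE *out)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
    {
        perror(path);
        return -1;
    }
    int rc = copy_file(fp, out);
    fclose(fp);
    return rc;
}
```

A caller would then write something like print_file(argv[1], stdout) and check the result.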
Jonathan Leffler answered Oct 11 '22 03:10


You're risking buffer overflow if a line contains 80 characters or more: buffer[80] has room for only 79 characters plus the terminating null.

I'm with ThiefMaster: you should use fgets() instead. Read the input into a buffer that's larger than any legitimate input, and then check that the last character is a newline. If it isn't, the line was too long for the buffer (or it is a final line with no newline).
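
A minimal sketch of that approach, assuming a hypothetical helper named read_line() and a buffer size chosen by the caller: it calls fgets(), checks whether the last character is a newline, and strips it if so.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper built on fgets().
   Returns 1 if a whole line was read (the newline is removed),
           0 if no newline was found: the line was longer than the buffer,
             or it is a final line with no trailing newline,
          -1 at end of file or on a read error. */
int read_line(FILE *fp, char *buffer, size_t buflen)
{
    if (fgets(buffer, (int)buflen, fp) == NULL)
        return -1;

    size_t len = strlen(buffer);
    if (len > 0 && buffer[len - 1] == '\n')
    {
        buffer[len - 1] = '\0'; /* Complete line: drop the newline */
        return 1;
    }
    return 0; /* Last character is not a newline */
}
```

With a buffer larger than any legitimate input, a return value of 0 in the middle of a file can simply be treated as bad input rather than something to piece back together.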

Steve Emmerson answered Oct 11 '22 05:10