
Reading all content from a text file - C

I am trying to read all content from a text file. Here is the code which I wrote.

#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 1024

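static char *readcontent(const char *filename)
{
    char *fcontent = NULL, *tmp;
    int c;  /* int, not char, so EOF can be told apart from valid bytes */
    int index = 0, pagenum = 0;
    FILE *fp;
    fp = fopen(filename, "r");

    if(fp) {
        while((c = getc(fp)) != EOF) {
            /* Grow by one page each time the pages allocated so far are full. */
            if(index == PAGE_SIZE * pagenum) {
                tmp = (char*) realloc(fcontent, PAGE_SIZE * ++pagenum + 1);
                if(!tmp) {
                    free(fcontent);
                    fclose(fp);
                    return NULL;
                }
                fcontent = tmp;
            }
            fcontent[index++] = c;
        }
        if(fcontent)  /* still NULL when the file is empty */
            fcontent[index] = '\0';
        fclose(fp);
    }
    return fcontent;
}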

static void freecontent(char *content)
{
    if(content) {
        free(content);
        content = NULL;
    }
}

This is the usage

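int main(int argc, char **argv)
{
    char *content;
    content = readcontent("filename.txt");
    if(!content) {  /* nothing was read, so there is nothing to print */
        fprintf(stderr, "Could not read filename.txt\n");
        return 1;
    }
    printf("File content : %s\n", content);
    fflush(stdout);
    freecontent(content);
    return 0;
}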

Since I am new to C, I wonder whether this code looks perfect? Do you see any problems/improvements?

Compiler used : GCC. But this code is expected to be cross platform.

Any help would be appreciated.

Edit

Here is the updated code with fread and ftell.

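static char *readcontent(const char *filename)
{
    char *fcontent = NULL;
    long fsize = 0;  /* ftell returns long */
    size_t nread;
    FILE *fp;

    fp = fopen(filename, "r");
    if(fp) {
        fseek(fp, 0, SEEK_END);
        fsize = ftell(fp);
        rewind(fp);

        if(fsize >= 0) {
            fcontent = (char*) malloc((size_t)fsize + 1);  /* +1 for '\0' */
            if(fcontent) {
                /* In text mode fread may return fewer bytes than fsize. */
                nread = fread(fcontent, 1, (size_t)fsize, fp);
                fcontent[nread] = '\0';
            }
        }

        fclose(fp);
    }
    return fcontent;
}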

I am wondering what will be the relative complexity of this function?

Navaneeth K N asked Aug 01 '10 06:08



3 Answers

You should look into the functions fsize (see the update below) and fread. Together they can give a large performance improvement.

Use fsize to get the size of the file you are reading, then use that size to allocate memory only once. (About fsize, see the update below. The idea of getting the size of the file and doing a single allocation stays the same.)

Use fread to read the file in blocks. This is much faster than reading it one character at a time.

Something like this:

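long size = fsize(fp);               /* see the update below about fsize */
fcontent = malloc(size + 1);         /* +1 for the terminating '\0' */
size_t nread = fread(fcontent, 1, size, fp);
fcontent[nread] = '\0';              /* so the buffer can be printed with %s */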

Update

fsize is not part of standard C, so it is not portable, but you can use this method to get the size of the file:

fseek(fp, 0, SEEK_END); 
size = ftell(fp);
fseek(fp, 0, SEEK_SET); 
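
Putting the pieces above together, a minimal sketch might look like the following. The helper name read_whole_file and its out_len parameter are made up for illustration. It assumes the stream is seekable and that ftell reports a byte count; the C standard only promises this loosely (binary streams need not support SEEK_END, and text-stream offsets are not guaranteed to be byte counts), so treat it as something that works on common desktop platforms rather than everywhere.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (name and signature are assumptions):
 * reads a whole file into a NUL-terminated heap buffer.
 * Returns NULL on failure; the caller frees the result. */
char *read_whole_file(const char *filename, size_t *out_len)
{
    char *buf = NULL;
    long size;
    size_t got;
    FILE *fp = fopen(filename, "rb");  /* binary mode: no newline translation */

    if (!fp)
        return NULL;

    /* Seek to the end, ask for the offset, then seek back to the start. */
    if (fseek(fp, 0, SEEK_END) == 0 && (size = ftell(fp)) >= 0 &&
        fseek(fp, 0, SEEK_SET) == 0) {
        buf = malloc((size_t)size + 1);  /* +1 for the terminator */
        if (buf) {
            got = fread(buf, 1, (size_t)size, fp);
            buf[got] = '\0';  /* terminate at what was actually read */
            if (out_len)
                *out_len = got;
        }
    }

    fclose(fp);
    return buf;
}
```

A caller would check the result for NULL before printing it and free it afterwards.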
Martin Ingvar Kofoed Jensen answered Sep 22 '22 22:09



People often realloc to twice the existing size, which makes each append amortized constant time, so the whole read is linear in the file size instead of potentially quadratic when growing by a fixed page. The buffer then ends up no more than twice as large as needed, which is usually okay, and you have the option of reallocating back down to the correct size after you're done.
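
As a rough sketch of the doubling approach (the function name read_stream_doubling, the out_len parameter, and the 1024-byte starting capacity are all arbitrary choices for illustration):

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a stream to EOF, doubling the buffer
 * whenever it fills up. Returns a NUL-terminated buffer, or NULL on
 * allocation failure. The caller frees the result. */
char *read_stream_doubling(FILE *fp, size_t *out_len)
{
    size_t cap = 1024;  /* starting capacity, an arbitrary choice */
    size_t len = 0;
    size_t got;
    char *tmp;
    char *buf = malloc(cap);

    if (!buf)
        return NULL;

    for (;;) {
        /* Always keep one byte spare for the terminator. */
        got = fread(buf + len, 1, cap - len - 1, fp);
        len += got;
        if (got == 0)
            break;  /* EOF or read error */
        if (len == cap - 1) {
            /* Buffer is full: double it, guarding against overflow. */
            if (cap > (size_t)-1 / 2) {
                free(buf);
                return NULL;
            }
            tmp = realloc(buf, cap * 2);
            if (!tmp) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    buf[len] = '\0';

    /* Optionally shrink back to the exact size; keep the old block on failure. */
    tmp = realloc(buf, len + 1);
    if (tmp)
        buf = tmp;

    if (out_len)
        *out_len = len;
    return buf;
}
```

Because it never asks for the file size, this version also works on streams that cannot be seeked, such as stdin or a pipe.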

But even better is to stat(2) for the file size and allocate once (with some extra room if the file size is volatile).
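
A sketch of the stat(2) variant, assuming a POSIX system (stat is not part of standard C, so this is not portable to every platform). The name read_file_with_stat and the out_len parameter are invented for the example, and it adds no extra room, so a file that grows between the stat and the read is simply truncated to the size seen by stat.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical helper: asks stat(2) for the size, allocates once,
 * then reads the file in a single fread call.
 * Returns NULL on failure; the caller frees the result. */
char *read_file_with_stat(const char *filename, size_t *out_len)
{
    struct stat st;
    FILE *fp;
    char *buf;
    size_t got;

    if (stat(filename, &st) != 0 || st.st_size < 0)
        return NULL;

    fp = fopen(filename, "rb");
    if (!fp)
        return NULL;

    buf = malloc((size_t)st.st_size + 1);  /* +1 for the terminator */
    if (buf) {
        got = fread(buf, 1, (size_t)st.st_size, fp);
        buf[got] = '\0';  /* the file may have shrunk since stat */
        if (out_len)
            *out_len = got;
    }

    fclose(fp);
    return buf;
}
```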

Also, why not use fgets(3) instead of reading character by character or, even better, mmap(2) the entire thing (or the relevant chunk if it's too large for memory)?
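
To illustrate the mmap(2) route, here is a small POSIX-only sketch that maps a file read-only and scans it in place without copying it into a buffer. The function name count_lines_mmap and the choice of counting newlines are just an example workload, not anything from the question.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: counts '\n' bytes in a file through a
 * read-only memory mapping. Returns -1 on failure. */
long count_lines_mmap(const char *filename)
{
    struct stat st;
    char *data;
    long lines = 0;
    off_t i;
    int fd = open(filename, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {  /* mmap rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (data == MAP_FAILED)
        return -1;

    for (i = 0; i < st.st_size; i++)
        if (data[i] == '\n')
            lines++;

    munmap(data, (size_t)st.st_size);
    return lines;
}
```

Note that the mapped bytes are not NUL-terminated, so they cannot be passed to printf("%s") directly; the length from fstat has to be carried along.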

Wang answered Sep 23 '22 22:09



It is probably slower and certainly more complex than:

while((c = getc(fp)) != EOF) {
    putchar(c);
}

which does the same thing as your code.

msw answered Sep 24 '22 22:09
