
Efficiency of fopen, fclose

When writing to a file iteratively in a nested for loop, is it any more efficient to open the file once before the loops and close it afterwards, rather than opening and closing it inside the outer loop? See the following:

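#include <stdio.h>

int main(void){
    FILE *file1;
    const char *filename = "out.txt"; /* placeholder name */
    int i, j, N = 10, M = 10;         /* placeholder sizes */

    for(i=0; i<N; i++){
        /* the file is reopened on every outer iteration */
        file1=fopen(filename, "a");
        if(file1 == NULL)
            return 1;
        for(j=0; j<M; j++){
            fprintf(file1,"%d %d\n", i, j);
        }
        fclose(file1);
    }
    return 0;
}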

or

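#include <stdio.h>

int main(void){
    FILE *file1;
    const char *filename = "out.txt"; /* placeholder name */
    int i, j, N = 10, M = 10;         /* placeholder sizes */

    /* the file is opened once, before both loops */
    file1=fopen(filename, "a");
    if(file1 == NULL)
        return 1;
    for(i=0; i<N; i++){
        for(j=0; j<M; j++){
            fprintf(file1, "%d %d\n", i, j);
        }
    }
    fclose(file1);
    return 0;
}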
Mike M asked Dec 02 '22 15:12



1 Answer

I ran a quick benchmark to see whether the difference is significant. The code differs slightly from yours, but it still shows the difference in efficiency. Note that I didn't account for caching or similar effects.

You can see for yourself if it's significant.

The test program:

#include <stdio.h>
#include <stdlib.h>


#ifdef TEST1
void test(char *filename, int n) {
    int i;
    FILE *fp;

    for (i=0; i<n; i++) {
        fp = fopen(filename, "a");
        if (fp) {
            fprintf(fp, "%d\n", i);
            fclose(fp);
        }
    }
}
#else
void test(char *filename, int n) {
    int i;
    FILE *fp;

    fp = fopen(filename, "a");
    if (!fp)
        return;

    for (i=0; i<n; i++) {
        fprintf(fp, "%d\n", i);
    }

    fclose(fp);
}
#endif

int main(int argc, char *argv[]) {
    char *filename;
    int n;

    if (argc < 3)
        return -1;

    filename = argv[1];
    n = atoi(argv[2]);

    test(filename, n);

    return 0;
}

The compile flags and benchmarking commands:

gcc -DTEST1 -Wall -O3 -o test1 test.c
gcc -DTEST2 -Wall -O3 -o test2 test.c

time ./test1 test.bin n; rm test.bin # where n is the number of iterations
time ./test2 test.bin n; rm test.bin # where n is the number of iterations

The machine is a 2.2GHz Core i7 with 8GB of RAM, running OS X.

The results:

   n   |  test1  |  test2
-------+---------+---------
10     | 0.009s  | 0.006s
100    | 0.036s  | 0.006s
1000   | 0.340s  | 0.007s
10000  | 2.535s  | 0.011s
100000 | 24.509s | 0.041s

So in conclusion, is there a difference? Yes.

Is there a significant difference? Yes, but only for a large(ish) number of iterations.

Does it matter? It depends. How many iterations are you planning to do? Up to around 1000 iterations, the difference isn't likely to be noticeable to the user. Anything higher and you'll start to see significant differences in running time between the two implementations.
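If you would rather measure this inside one program than with the shell's time command, the sketch below shows one possible way. The function names (write_reopening, write_open_once, time_strategy) are made up for illustration and are not part of the test program above. It assumes a POSIX system where clock_gettime with CLOCK_MONOTONIC is available.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Strategy 1: reopen the file in append mode for every line written. */
int write_reopening(const char *path, int n) {
    for (int i = 0; i < n; i++) {
        FILE *fp = fopen(path, "a");
        if (!fp)
            return -1;
        fprintf(fp, "%d\n", i);
        if (fclose(fp) != 0)
            return -1;
    }
    return 0;
}

/* Strategy 2: open once, write every line, close once. */
int write_open_once(const char *path, int n) {
    FILE *fp = fopen(path, "a");
    if (!fp)
        return -1;
    for (int i = 0; i < n; i++)
        fprintf(fp, "%d\n", i);
    return fclose(fp) == 0 ? 0 : -1;
}

/* Runs one strategy and returns the elapsed wall-clock time in seconds,
   or -1.0 if the strategy reported a failure. */
double time_strategy(int (*fn)(const char *, int), const char *path, int n) {
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int rc = fn(path, n);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    if (rc != 0)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling time_strategy(write_reopening, "test.bin", n) and time_strategy(write_open_once, "test.bin", n) for a few values of n, deleting the file between runs, should let you compare the two on your own machine. The absolute numbers will depend on the OS and filesystem.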

At the end of the day, if you can code for efficiency without too much effort, why deliberately use a less efficient algorithm?
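One reason people reopen the file inside the outer loop is to make sure each batch of lines leaves the program's buffer before the next batch starts. If that is your motivation (an assumption, since the question doesn't say), a possible middle ground is to open the file once and call fflush after each outer iteration. The sketch below applies this to the nested loop from the question. The function name write_pairs_flushing is hypothetical.

```c
#include <stdio.h>

/* Appends every pair (i, j) with 0 <= i < n and 0 <= j < m to path.
   The file is opened once; buffered output is flushed after each outer
   iteration instead of closing and reopening the file.
   Returns 0 on success, -1 on failure. */
int write_pairs_flushing(const char *path, int n, int m) {
    FILE *fp = fopen(path, "a");
    if (!fp)
        return -1;

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++) {
            if (fprintf(fp, "%d %d\n", i, j) < 0) {
                fclose(fp);
                return -1;
            }
        }
        /* Hand the buffered lines to the operating system now. */
        if (fflush(fp) != 0) {
            fclose(fp);
            return -1;
        }
    }

    return fclose(fp) == 0 ? 0 : -1;
}
```

Note that fflush only moves data from the stdio buffer to the operating system. It does not by itself guarantee the data has reached the disk.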

tangrs answered Dec 24 '22 03:12
