
Very surprising performance of fprintf vs std::ofstream (fprintf is very slow)

I was running some benchmarks to find the most efficient way to write a huge array to a file in C++ (more than 1 GB in ASCII).

So I compared std::ofstream with fprintf (see the switch I used below):

    case 0: {
        std::ofstream out(title, std::ios::out | std::ios::trunc);
        if (out) {
            ok = true;
            for (i=0; i<M; i++) {
                for (j=0; j<N; j++) {
                    out<<A[i][j]<<" ";
                }
                out<<"\n";
            }
            out.close();
        } else {
            std::cout<<"Error with file : "<<title<<"\n";
        }
        break;
    }
    case 1: {
        FILE *out = fopen(title.c_str(), "w");
        if (out!=NULL) {
            ok = true;
            for (i=0; i<M; i++) {
                for (j=0; j<N; j++) {
                    fprintf(out, "%d ", A[i][j]);
                }
                fprintf(out, "\n");
            }
            fclose(out);
        } else {
            std::cout<<"Error with file : "<<title<<"\n";
        }
        break;
    }

My problem is that fprintf seems to be more than 12x slower than std::ofstream. Do you have any idea what causes this in my code? Or is std::ofstream simply much better optimized than fprintf?

(Another question: do you know of a faster way to write a file?)

Thank you very much

(Detail: I was compiling with g++ -Wall -O3.)

Vincent asked Oct 24 '11 14:10


4 Answers

fprintf with "%d" requires runtime parsing of the format string, once per integer. For out << A[i][j], the matching std::ostream::operator<<(int) overload is resolved by the compiler, once per compilation.
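
To make the distinction concrete, here is a minimal sketch. The names tiny_format and append_value are hypothetical and invented for this illustration; tiny_format is a toy that understands only "%d" and is not how any real C library implements printf. It only shows that a format string has to be scanned on every call, whereas an overload is chosen from the static type of the argument.

```cpp
#include <cstdarg>
#include <string>

// Hypothetical toy formatter: understands only "%d" and literal characters.
// Real printf implementations are far more elaborate; the point is that the
// format string is walked character by character on every single call.
std::string tiny_format(const char* fmt, ...) {
    std::string result;
    va_list args;
    va_start(args, fmt);
    for (const char* p = fmt; *p != '\0'; ++p) {
        if (p[0] == '%' && p[1] == 'd') {
            // The argument type is known only after reading the specifier.
            result += std::to_string(va_arg(args, int));
            ++p;  // skip the 'd'
        } else {
            result += *p;
        }
    }
    va_end(args);
    return result;
}

// Overload-based formatting: the compiler selects the function from the
// static type of the argument, so nothing is parsed at run time.
void append_value(std::string& out, int value) { out += std::to_string(value); }
void append_value(std::string& out, const char* text) { out += text; }
```

Calling tiny_format("%d ", 42) and calling append_value(s, 42) followed by append_value(s, " ") build the same text; only the dispatch mechanism differs. Whether that difference is large enough to explain a 12x gap is a separate question that has to be measured.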

MSalters answered Nov 08 '22 12:11


Well, fprintf() does have to do a bit more work at runtime, since it has to parse and process the format string. However, given the size of your output file I would expect those differences to be of little consequence, and would expect the code to be I/O bound.

I therefore suspect that your benchmark is flawed in some way.

  1. Do you consistently get a 12x difference if you run the tests repeatedly?
  2. What happens to the timings if you reverse the order in which you run the tests?
  3. What happens if you call fsync()/sync() at the end?
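
The first two checks could be scripted roughly as follows. This is only a sketch under assumptions: the names Matrix, Timings, time_seconds and run_benchmark are invented here, the array is assumed to be a vector of int rows, and the timings do not include flushing the OS cache to disk (no fsync()/sync() call), so the third check is not covered.

```cpp
#include <chrono>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Assumption: the array from the question is modelled as rows of int.
using Matrix = std::vector<std::vector<int>>;

// Same output format as case 0 in the question.
bool write_with_ofstream(const std::string& path, const Matrix& a) {
    std::ofstream out(path, std::ios::out | std::ios::trunc);
    if (!out) return false;
    for (const auto& row : a) {
        for (int v : row) out << v << ' ';
        out << '\n';
    }
    out.close();
    return !out.fail();
}

// Same output format as case 1 in the question.
bool write_with_fprintf(const std::string& path, const Matrix& a) {
    std::FILE* out = std::fopen(path.c_str(), "w");
    if (out == nullptr) return false;
    for (const auto& row : a) {
        for (int v : row) std::fprintf(out, "%d ", v);
        std::fputc('\n', out);
    }
    return std::fclose(out) == 0;
}

// Time one call of `writer` in seconds with a monotonic clock.
// The writer's success flag is ignored here to keep the sketch short.
template <typename Writer>
double time_seconds(Writer writer, const std::string& path, const Matrix& a) {
    const auto start = std::chrono::steady_clock::now();
    writer(path, a);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

struct Timings {
    std::vector<double> ofstream_s;  // one entry per repeat
    std::vector<double> fprintf_s;   // one entry per repeat
};

// Run both writers `repeats` times, swapping which one goes first on every
// other repeat, so that ordering and cache effects show up in the numbers.
Timings run_benchmark(const Matrix& a, int repeats,
                      const std::string& ofstream_path,
                      const std::string& fprintf_path) {
    Timings t;
    for (int r = 0; r < repeats; ++r) {
        if (r % 2 == 0) {
            t.ofstream_s.push_back(time_seconds(write_with_ofstream, ofstream_path, a));
            t.fprintf_s.push_back(time_seconds(write_with_fprintf, fprintf_path, a));
        } else {
            t.fprintf_s.push_back(time_seconds(write_with_fprintf, fprintf_path, a));
            t.ofstream_s.push_back(time_seconds(write_with_ofstream, ofstream_path, a));
        }
    }
    return t;
}
```

If the ratio between the two columns of timings stays near 12x across repeats and in both orders, the difference is probably real; if it moves around a lot, the measurement itself is suspect.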

NPE answered Nov 08 '22 12:11


std::ofstream has its own file buffer, which may reduce the number of disk accesses. In addition, fprintf is a variadic function that has to go through the va_* macros (va_start, va_arg, va_end), which ofstream does not. You could try fwrite() or putc() as a test.

YangG answered Nov 08 '22 11:11


Have you set sync_with_stdio somewhere upstream of the code you have shown?

What you report is the opposite of what is usually observed in practice, although most people believe that what you see should be the norm. iostreams are type-safe, whereas the printf family are variadic functions that have to infer the types of their va_list arguments from the format specifiers.

Happy Green Kid Naps answered Nov 08 '22 13:11