Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is fwrite faster than WriteFile in windows?

I have always thought that WriteFile is more efficient than fwrite, because fwrite calls down to WriteFile internally, but the following test code show me that fwrite is faster significantly than WriteFile.

fwrite costs 2 milliseconds while WriteFile need 27000(FILE_ATTRIBUTE_NORMAL), both flushes after every write call. If I call WriteFile with FILE_FLAG_WRITE_THROUGH, and comment the FlushFileBuffers(wfile) line , WriteFile will be faster, it costs 800.

So Is it really that fwrite calls down to WriteFile? What makes such a huge difference? How does fwrite work internally? How can I write data to file with API more efficiently than fwrite?(unbufferd, synchronous).

   #include <Windows.h>
   #include <stdio.h>
   #include <iostream>

   int main() {
     FILE* cfile = fopen("file1.txt", "w");
     HANDLE wfile = CreateFile("file2.txt", GENERIC_WRITE, FILE_SHARE_READ, NULL, CREATE_ALWAYS, 
           /*FILE_ATTRIBUTE_NORMAL*/FILE_FLAG_WRITE_THROUGH, NULL);
     DWORD written = 0;

     DWORD start_time, end_time;
     char * text = "test message ha ha ha ha";
     int size = strlen(text);
     int times = 999;

     start_time = timeGetTime();
     for(int i = 0; i < times; ++i) {
       fwrite(text, 1, size, cfile);
       fflush(cfile);
     }
     end_time = timeGetTime();
     std::cout << end_time - start_time << '\n';

     start_time = timeGetTime();
     for(int i = 0; i < times; ++i) {
         WriteFile(wfile, text, size, &written, NULL);
         //FlushFileBuffers(wfile);
     }
     end_time = timeGetTime();
     std::cout << end_time - start_time << std::endl;

     system("pause");
     return 0;
   }

Update: Thanks for the answers, here is the answer: see VS directory\VS\crt\src\fflush.c:

    //fflush.c
    int __cdecl _fflush_nolock (FILE *str) {
        //irrelevant codes
        if (str->_flag & _IOCOMMIT) {
                return (_commit(_fileno(str)) ? EOF : 0);
        }
        return 0;
    }

so here is a _IOCOMMIT flag, then see ...\src\fdopen.c

    FILE * __cdecl _tfdopen (int filedes, const _TSCHAR *mode) {
      //irrelevant codes
        while(*++mode && whileflag)
          switch(*mode) {
      //...
              case _T('c'):
                if (cnflag)
                    whileflag = 0;
                else {
                    cnflag = 1;
                    fileflag |= _IOCOMMIT;
                }
               break;
     //...
    }

_tfopen is called by fopen internally, refer to the documents of fopen, I find this:

" mode: 'c'

Enable the commit flag for the associated filename so that the contents of the file buffer are written directly to disk if either fflush or _flushall is called." So, _commit is called only if 'c' flag is set when calling fopen.

the _commit function eventually calls FlushFileBuffers.

Besides these, I find that when I write only a few data to file(not exceeds the buffer size), if fwrite without fflush, the text will not be wrritten apparently, while for API, after WriteFile even if I don't call FlushFileBuffers, when I open the file(program is in Sleep), the content is written to file automatically, that was one of the reasons why I was confused about flush , this operation may be done by OS, WriteFile copies data to system cache, and its file buffer is managed by OS, so it's resonable that fflush() only calls WriteFile internally without real flushing, system knows when to flush them, maybe when file handle is closed or when another I/O access to this file is occured. So I modified the benchmark as this:

      start_time = timeGetTime();
for(int i = 0; i < times; ++i) {
    fwrite(text, 1, size, cfile);
    fflush(cfile);
}
end_time = timeGetTime();
std::cout << end_time - start_time << '\n';

start_time = timeGetTime();
for(int i = 0; i < times; ++i) {
    WriteFile(wfile, text, size, &written, NULL);
}
end_time = timeGetTime();
std::cout << end_time - start_time << std::endl;

the result is times:99999 fwrite:217 WriteFile:171

So, in conclusion, to speed up API file writting operation:

  1. Do not call FlushFileBuffers explicitly, data in system cache will be flushed to disk when needed.

  2. Get a buffer for WriteFile, just as fwrite does, because API call cost more time than simply memcpy, call WriteFile when buffer is filled up.

like image 377
Frahm Avatar asked Jan 12 '13 04:01

Frahm


1 Answers

Using a tool like Process Monitor (procmon) from Sysinternals, you'll see that the call to fflush() isn't doing the same thing as FlushFileBuffers(wfile) (or the FILE_FLAG_WRITE_THROUGH flag to CreateFile()).

fwrite() will write the data to a buffer until that buffer fills, which will cause it to send the data in the buffer to WriteFile() call. When you call fflush() all that happens is that the data currently in the buffer is passed to a call to WriteFile() - fflush() doesn't call FlushFileBuffers():

1:21:32.9391534 AM  test.exe    6132    WriteFile   C:\temp\file1.txt   SUCCESS Offset: 0, Length: 24
1:21:32.9392200 AM  test.exe    6132    WriteFile   C:\temp\file1.txt   SUCCESS Offset: 24, Length: 24
1:21:32.9392340 AM  test.exe    6132    WriteFile   C:\temp\file1.txt   SUCCESS Offset: 48, Length: 24
1:21:32.9392436 AM  test.exe    6132    WriteFile   C:\temp\file1.txt   SUCCESS Offset: 72, Length: 24
1:21:32.9392526 AM  test.exe    6132    WriteFile   C:\temp\file1.txt   SUCCESS Offset: 96, Length: 24
1:21:32.9392623 AM  test.exe    6132    WriteFile   C:\temp\file1.txt   SUCCESS Offset: 120, Length: 24

For comparison, here's an example of a trace from the fwrite() loop without the fflush() call:

1:27:28.5675034 AM  test.exe    3140    WriteFile   C:\temp\file1.txt   SUCCESS Offset: 0, Length: 1,024
1:27:28.5676098 AM  test.exe    3140    WriteFile   C:\temp\file1.txt   SUCCESS Offset: 1,024, Length: 1,024
1:27:28.5676399 AM  test.exe    3140    WriteFile   C:\temp\file1.txt   SUCCESS Offset: 2,048, Length: 1,024
1:27:28.5676651 AM  test.exe    3140    WriteFile   C:\temp\file1.txt   SUCCESS Offset: 3,072, Length: 1,024

And here's a snippet of the trace from the WriteFile() loop (with the FILE_ATTRIBUTE_NORMAL flag and explicit call to FlushFileBuffers() - it just makes what's happening easier to see in the trace since the FlushFileBuffers() call is shown in the trace instead of just showing as a second 4KB WriteFile() call).

1:21:29.0068503 AM  test.exe    6132    WriteFile   C:\temp\file2.txt   SUCCESS Offset: 0, Length: 24, Priority: Normal
1:21:29.0069197 AM  test.exe    6132    FlushBuffersFile    C:\temp\file2.txt   SUCCESS 
1:21:29.0069517 AM  test.exe    6132    WriteFile   C:\temp\file2.txt   SUCCESS Offset: 0, Length: 4,096, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal
1:21:29.0087574 AM  test.exe    6132    WriteFile   C:\temp\file2.txt   SUCCESS Offset: 24, Length: 24
1:21:29.0087798 AM  test.exe    6132    FlushBuffersFile    C:\temp\file2.txt   SUCCESS 
1:21:29.0088087 AM  test.exe    6132    WriteFile   C:\temp\file2.txt   SUCCESS Offset: 0, Length: 4,096, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal
1:21:29.0102260 AM  test.exe    6132    WriteFile   C:\temp\file2.txt   SUCCESS Offset: 48, Length: 24
1:21:29.0102428 AM  test.exe    6132    FlushBuffersFile    C:\temp\file2.txt   SUCCESS 
1:21:29.0102701 AM  test.exe    6132    WriteFile   C:\temp\file2.txt   SUCCESS Offset: 0, Length: 4,096, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal
1:21:29.0113444 AM  test.exe    6132    WriteFile   C:\temp\file2.txt   SUCCESS Offset: 72, Length: 24
1:21:29.0113602 AM  test.exe    6132    FlushBuffersFile    C:\temp\file2.txt   SUCCESS 
1:21:29.0113848 AM  test.exe    6132    WriteFile   C:\temp\file2.txt   SUCCESS Offset: 0, Length: 4,096, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal

So the reason that your benchmark shows a severe disadvantage to the WriteFile() loop is simply because you have about a thousand calls to FlushFileBuffers() that aren't in the fwrite() loop.

like image 101
Michael Burr Avatar answered Nov 16 '22 00:11

Michael Burr