Consider the following program:
#define _FILE_OFFSET_BITS 64 // Allow large files.
#define REVISION "POSIX Revision #9"
#include <iostream>
#include <cstdio>
#include <ctime>

const int block_size = 1024 * 1024;
const char block[block_size] = {};

int main()
{
    std::cout << REVISION << std::endl;

    std::time_t t0 = time(NULL);

    std::cout << "Open: 'BigFile.bin'" << std::endl;
    FILE * file;
    file = fopen("BigFile.bin", "wb");
    if (file != NULL)
    {
        std::cout << "Opened. Writing..." << std::endl;
        for (int n=0; n<4096; n++)
        {
            size_t written = fwrite(block, 1, block_size, file);
            if (written != block_size)
            {
                std::cout << "Write error." << std::endl;
                return 1;
            }
        }
        fclose(file);
        std::cout << "Success." << std::endl;

        time_t t1 = time(NULL);
        if (t0 == ((time_t)-1) || t1 == ((time_t)-1))
        {
            std::cout << "Clock error." << std::endl;
            return 2;
        }

        double ticks = (double)(t1 - t0);
        std::cout << "Seconds: " << ticks << std::endl;

        file = fopen("BigFile.log", "w");
        fprintf(file, REVISION);
        fprintf(file, " Seconds: %f\n", ticks);
        fclose(file);

        return 0;
    }

    std::cout << "Something went wrong." << std::endl;
    return 1;
}
It simply writes 4 GB of zeros to a file on disk and times how long that takes.
Under Linux, this takes 148 seconds on average. Under Windows, on the same PC, it takes 247 seconds on average.
What the hell am I doing wrong?!
The code is compiled under GCC for Linux, and Visual Studio for Windows, but I cannot imagine a universe in which the compiler used should make any measurable difference to a pure I/O benchmark. The filesystem used in all cases is NTFS.
I just don't understand why such a vast performance difference exists, or why Windows is running so slowly. How do I get Windows to run at the full speed that the disk is clearly capable of?
(The numbers above are for OpenSUSE 13.1 32-bit and Windows XP 32-bit on an old Dell laptop. But I've observed similar speed differences on several PCs around the office, running various versions of Windows.)
Edit: The executable and the file it writes both reside on an external USB hard disk which is formatted as NTFS and is nearly completely empty. Fragmentation is almost certainly not a problem. It could be some kind of driver issue, but I've seen the same performance difference on several other systems running different versions of Windows. There is no antivirus installed.
Just for giggles, I tried changing it to use the Win32 API directly. (Obviously this only works for Windows.) The time becomes a little more erratic, but still within a few percent of what it was before. Unless I specify FILE_FLAG_WRITE_THROUGH; then it goes significantly slower. A few other flags make it slower, but I can't find the one that makes it go faster...
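(For reference, a write loop using the Win32 API directly might look roughly like the sketch below; the exact flags and error handling are an assumption for illustration, not the code from the original test. FILE_FLAG_WRITE_THROUGH would be OR'd into the flags argument of CreateFileA.)

#include <windows.h>
#include <iostream>

const int block_size = 1024 * 1024;
const char block[block_size] = {};

int main()
{
    // CREATE_ALWAYS truncates/overwrites any existing file.
    // Adding FILE_FLAG_WRITE_THROUGH here is what made the run noticeably slower.
    HANDLE h = CreateFileA("BigFile.bin", GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return 1;

    for (int n = 0; n < 4096; n++)
    {
        DWORD written = 0;
        if (!WriteFile(h, block, block_size, &written, NULL) || written != block_size)
        {
            std::cout << "Write error." << std::endl;
            CloseHandle(h);
            return 1;
        }
    }

    CloseHandle(h);
    return 0;
}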
You need to sync the file contents to disk; otherwise you are just measuring the level of caching being performed by the operating system.

Call fsync before you close the file.

If you don't do this, the majority of the execution time is most likely spent waiting for the cache to be flushed so that new data can be stored in it, but a portion of the data you write will certainly not have been written out to disk by the time you close the file. The difference in execution times, then, is probably due to Linux caching more of the writes before it runs out of available cache space. By contrast, if you call fsync before closing the file, all of the written data should be flushed to disk before your time measurement takes place.

I suspect that if you add an fsync call, the execution times on the two systems won't differ by so much.
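A minimal sketch of where the flush would go in the stdio-based program above (the placement is a suggestion, not a tested change to the asker's code):

#include <cstdio>
#include <unistd.h>   // fsync(), fileno() (POSIX)

// ... after the write loop, before fclose() and before taking t1:
fflush(file);              // flush stdio's user-space buffer into the kernel
fsync(fileno(file));       // block until the kernel has written its dirty pages to disk
fclose(file);

time_t t1 = time(NULL);    // now t1 - t0 includes the time for the data to reach the disk

On Windows, the C runtime equivalent is _commit(_fileno(file)) from <io.h>, or FlushFileBuffers() if you hold a raw HANDLE.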