
Why is my C++ disk write test much slower than a simple file copy using bash?

Using the program below, I try to test how fast I can write to disk using std::ofstream.

I achieve around 300 MiB/s when writing a 1 GiB file.

However, a simple file copy using the cp command is at least twice as fast.

Is my program hitting the hardware limit or can it be made faster?

#include <chrono>
#include <iostream>
#include <fstream>

char payload[1000 * 1000]; // 1 MB

void test(int MB)
{
    // Configure buffer
    char buffer[32 * 1000];
    std::ofstream of("test.file");
    of.rdbuf()->pubsetbuf(buffer, sizeof(buffer));

    auto start_time = std::chrono::steady_clock::now();

    // Write a total of MB megabytes
    for (auto i = 0; i != MB; ++i)
    {
        of.write(payload, sizeof(payload));
    }

    double elapsed_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::steady_clock::now() - start_time).count();
    double megabytes_per_ns = 1e3 / elapsed_ns;
    double megabytes_per_s = 1e9 * megabytes_per_ns;
    std::cout << "Payload=" << MB << "MB Speed=" << megabytes_per_s << "MB/s" << std::endl;
}

int main()
{
    for (auto i = 1; i <= 10; ++i)
    {
        test(i * 100);
    }
}

Output:

Payload=100MB Speed=3792.06MB/s
Payload=200MB Speed=1790.41MB/s
Payload=300MB Speed=1204.66MB/s
Payload=400MB Speed=910.37MB/s
Payload=500MB Speed=722.704MB/s
Payload=600MB Speed=579.914MB/s
Payload=700MB Speed=499.281MB/s
Payload=800MB Speed=462.131MB/s
Payload=900MB Speed=411.414MB/s
Payload=1000MB Speed=364.613MB/s

Update

I changed from std::ofstream to fwrite:

#include <chrono>
#include <cstdio>
#include <iostream>

char payload[1024 * 1024]; // 1 MiB

void test(int number_of_megabytes)
{
    FILE* file = fopen("test.file", "w");

    auto start_time = std::chrono::steady_clock::now();

    // Write number_of_megabytes MiB in total
    for (auto i = 0; i != number_of_megabytes; ++i)
    {
       fwrite(payload, 1, sizeof(payload), file );
    }
    fclose(file); // TODO: RAII

    double elapsed_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::steady_clock::now() - start_time).count();
    double megabytes_per_ns = 1e3 / elapsed_ns;
    double megabytes_per_s = 1e9 * megabytes_per_ns;
    std::cout << "Size=" << number_of_megabytes << "MiB Duration=" << long(0.5 + 100 * elapsed_ns/1e9)/100.0 << "s Speed=" << megabytes_per_s << "MiB/s" << std::endl;
}

int main()
{
    test(256);
    test(512);
    test(1024);
    test(1024);
}

Which improves the speed to 668MiB/s for a 1 GiB file:

Size=256MiB   Duration=0.4s   Speed=2524.66MiB/s
Size=512MiB   Duration=0.79s  Speed=1262.41MiB/s
Size=1024MiB  Duration=1.5s   Speed=664.521MiB/s
Size=1024MiB  Duration=1.5s   Speed=668.85MiB/s

Which is just as fast as dd:

time dd if=/dev/zero of=test.file bs=1024 count=0 seek=1048576

real    0m1.539s
user    0m0.001s
sys 0m0.344s
asked Mar 04 '17 by StackedCrooked


3 Answers

First, you're not really measuring the disk writing speed, but (partly) the speed of writing data to the OS disk cache. To really measure the disk writing speed, the data should be flushed to disk before calculating the time. Without flushing there could be a difference depending on the file size and the available memory.

There seems to be something wrong in the calculations too: you're not using the value of MB, so the printed speed is only correct when MB happens to be 1000.
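A minimal sketch combining those two points, based on the fwrite() version from the update: flush before stopping the clock (fsync()/fileno() assume a POSIX system), and compute the speed from the number of megabytes actually written:

#include <chrono>
#include <cstdio>
#include <iostream>
#include <unistd.h>   // fsync(), fileno() (POSIX)

char payload[1024 * 1024]; // 1 MiB

// Hypothetical helper: write `megabytes` MiB, flush to disk, then report speed.
void timed_write(int megabytes)
{
    FILE* file = std::fopen("test.file", "wb");
    if (!file) return;

    auto start_time = std::chrono::steady_clock::now();

    for (int i = 0; i != megabytes; ++i)
        std::fwrite(payload, 1, sizeof(payload), file);

    // Push the data out of the stdio buffer and out of the OS page cache
    // before stopping the clock, so the disk speed is what gets measured.
    std::fflush(file);
    fsync(fileno(file));

    double elapsed_s =
        std::chrono::duration<double>(std::chrono::steady_clock::now() - start_time).count();
    std::fclose(file);

    std::cout << "Size=" << megabytes << "MiB Speed=" << megabytes / elapsed_s << "MiB/s\n";
}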

Also make sure the buffer size is a power of two, or at least a multiple of the disk page size (4096 bytes): char buffer[32 * 1024]. You might as well do that for payload too. (It looks like you changed payload from 1024 to 1000 in the edit where you added the calculations.)

Do not use streams to write a (binary) buffer of data to disk, but instead write directly to the file, using FILE*, fopen(), fwrite(), fclose(). See this answer for an example and some timings.


To copy a file: open the source file in read-only and, if possible, forward-only mode, and copy it with fread() and fwrite():

while fread() from source to buffer
  fwrite() buffer to destination file

This should give you a speed comparable to the speed of an OS file copy (you might want to test some different buffer sizes).
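As a rough sketch of that loop (file names and the buffer size are placeholders, error handling kept to a minimum):

#include <cstdio>

// Copy src to dst through a fixed-size buffer with fread()/fwrite().
bool copy_file(const char* src, const char* dst)
{
    FILE* in = std::fopen(src, "rb");
    if (!in) return false;
    FILE* out = std::fopen(dst, "wb");
    if (!out) { std::fclose(in); return false; }

    char buffer[64 * 1024]; // the buffer size is worth experimenting with
    size_t n;
    while ((n = std::fread(buffer, 1, sizeof(buffer), in)) > 0)
        std::fwrite(buffer, 1, n, out);

    std::fclose(in);
    std::fclose(out);
    return true;
}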

This might be slightly faster using memory mapping:

open src, create memory mapping over the file
open/create dest, set file size to size of src, create memory mapping over the file
memcpy() src to dest

For large files, smaller mapped views should be used rather than mapping the whole file at once.
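A sketch of the memory-mapped variant, assuming a POSIX system (open/fstat/ftruncate/mmap); for simplicity it maps each file in one piece, which you would avoid for very large files as noted above:

#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Copy src to dst by mapping both files and memcpy'ing between the mappings.
bool mmap_copy(const char* src, const char* dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0) return false;

    struct stat st;
    fstat(in, &st);
    size_t size = static_cast<size_t>(st.st_size);

    int out = open(dst, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (out < 0) { close(in); return false; }
    ftruncate(out, st.st_size); // destination must have the source's size before mapping

    void* src_map = mmap(nullptr, size, PROT_READ,  MAP_PRIVATE, in, 0);
    void* dst_map = mmap(nullptr, size, PROT_WRITE, MAP_SHARED,  out, 0);
    bool ok = (src_map != MAP_FAILED && dst_map != MAP_FAILED);

    if (ok)
        std::memcpy(dst_map, src_map, size);

    if (src_map != MAP_FAILED) munmap(src_map, size);
    if (dst_map != MAP_FAILED) munmap(dst_map, size);
    close(in);
    close(out);
    return ok;
}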

answered Nov 20 '22 by Danny_ds


  1. Streams are slow.
  2. cp uses the read(2)/write(2) syscalls (or mmap(2)) directly; a minimal sketch follows below.
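For comparison, a hedged sketch of such a copy loop using the raw read(2)/write(2) syscalls (POSIX only; file names and buffer size are placeholders, error handling is minimal):

#include <fcntl.h>
#include <unistd.h>

// Copy src to dst with unbuffered read(2)/write(2), roughly what cp does
// (cp adds extras such as sparse-file detection and larger buffers).
bool syscall_copy(const char* src, const char* dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0) return false;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) { close(in); return false; }

    char buffer[128 * 1024];
    ssize_t n;
    while ((n = read(in, buffer, sizeof(buffer))) > 0)
        write(out, buffer, static_cast<size_t>(n));

    close(in);
    close(out);
    return true;
}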
answered Nov 20 '22 by Fabian Klötzl


I'd wager that it's something clever inside either cp or the filesystem. If it's inside cp, it might be that the file you are copying contains a lot of zeros, and cp detects this and writes a sparse version of the file. The man page for cp says: "By default, sparse SOURCE files are detected by a crude heuristic and the corresponding DEST file is made sparse as well." In that case cp makes a sparse copy of your file, which requires far less disk write time.
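To illustrate why a sparse copy is cheap: a region full of zeros can be represented as a hole that is never written (this is essentially what the dd invocation in the question does, since count=0 with seek= just extends the file without writing data). A hedged sketch using POSIX calls:

#include <fcntl.h>
#include <unistd.h>

// Create a 1 GiB file without writing any data: the whole file is a "hole",
// so the filesystem allocates (almost) no blocks.
int main()
{
    int fd = open("sparse.file", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return 1;

    ftruncate(fd, 1024L * 1024 * 1024); // extend to 1 GiB without writing

    close(fd);
    // 'ls -l sparse.file' reports 1 GiB; 'du -h sparse.file' reports ~0.
    return 0;
}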

If it's within your filesystem, it might be deduplication.

As a long-shot third possibility, it might also be something within your OS or your disk firmware that translates the reads and writes into some specialized instruction that doesn't require as much synchronization as your program does (lower bus use means less latency).

answered Nov 20 '22 by Jacob Statnekov