Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to write a large buffer into a binary file in C++, fast?

I'm trying to write huge amounts of data onto my SSD(solid state drive). And by huge amounts I mean 80GB.

I browsed the web for solutions, but the best I came up with was this:

#include <fstream>
const unsigned long long size = 64ULL*1024ULL*1024ULL;
unsigned long long a[size];
int main()
{
    std::fstream myfile;
    myfile = std::fstream("file.binary", std::ios::out | std::ios::binary);
    //Here would be some error handling
    for(int i = 0; i < 32; ++i){
        //Some calculations to fill a[]
        myfile.write((char*)&a,size*sizeof(unsigned long long));
    }
    myfile.close();
}

Compiled with Visual Studio 2010 and full optimizations and run under Windows7 this program maxes out around 20MB/s. What really bothers me is that Windows can copy files from an other SSD to this SSD at somewhere between 150MB/s and 200MB/s. So at least 7 times faster. That's why I think I should be able to go faster.

Any ideas how I can speed up my writing?

like image 200
Dominic Hofer Avatar asked Jul 19 '12 15:07

Dominic Hofer


3 Answers

This did the job (in the year 2012):

#include <stdio.h>
const unsigned long long size = 8ULL*1024ULL*1024ULL;
unsigned long long a[size];

int main()
{
    FILE* pFile;
    pFile = fopen("file.binary", "wb");
    for (unsigned long long j = 0; j < 1024; ++j){
        //Some calculations to fill a[]
        fwrite(a, 1, size*sizeof(unsigned long long), pFile);
    }
    fclose(pFile);
    return 0;
}

I just timed 8GB in 36sec, which is about 220MB/s and I think that maxes out my SSD. Also worth to note, the code in the question used one core 100%, whereas this code only uses 2-5%.

Thanks a lot to everyone.

Update: 5 years have passed it's 2017 now. Compilers, hardware, libraries and my requirements have changed. That's why I made some changes to the code and did some new measurements.

First up the code:

#include <fstream>
#include <chrono>
#include <vector>
#include <cstdint>
#include <numeric>
#include <random>
#include <algorithm>
#include <iostream>
#include <cassert>

std::vector<uint64_t> GenerateData(std::size_t bytes)
{
    assert(bytes % sizeof(uint64_t) == 0);
    std::vector<uint64_t> data(bytes / sizeof(uint64_t));
    std::iota(data.begin(), data.end(), 0);
    std::shuffle(data.begin(), data.end(), std::mt19937{ std::random_device{}() });
    return data;
}

long long option_1(std::size_t bytes)
{
    std::vector<uint64_t> data = GenerateData(bytes);

    auto startTime = std::chrono::high_resolution_clock::now();
    auto myfile = std::fstream("file.binary", std::ios::out | std::ios::binary);
    myfile.write((char*)&data[0], bytes);
    myfile.close();
    auto endTime = std::chrono::high_resolution_clock::now();

    return std::chrono::duration_cast<std::chrono::milliseconds>(endTime - startTime).count();
}

long long option_2(std::size_t bytes)
{
    std::vector<uint64_t> data = GenerateData(bytes);

    auto startTime = std::chrono::high_resolution_clock::now();
    FILE* file = fopen("file.binary", "wb");
    fwrite(&data[0], 1, bytes, file);
    fclose(file);
    auto endTime = std::chrono::high_resolution_clock::now();

    return std::chrono::duration_cast<std::chrono::milliseconds>(endTime - startTime).count();
}

long long option_3(std::size_t bytes)
{
    std::vector<uint64_t> data = GenerateData(bytes);

    std::ios_base::sync_with_stdio(false);
    auto startTime = std::chrono::high_resolution_clock::now();
    auto myfile = std::fstream("file.binary", std::ios::out | std::ios::binary);
    myfile.write((char*)&data[0], bytes);
    myfile.close();
    auto endTime = std::chrono::high_resolution_clock::now();

    return std::chrono::duration_cast<std::chrono::milliseconds>(endTime - startTime).count();
}

int main()
{
    const std::size_t kB = 1024;
    const std::size_t MB = 1024 * kB;
    const std::size_t GB = 1024 * MB;

    for (std::size_t size = 1 * MB; size <= 4 * GB; size *= 2) std::cout << "option1, " << size / MB << "MB: " << option_1(size) << "ms" << std::endl;
    for (std::size_t size = 1 * MB; size <= 4 * GB; size *= 2) std::cout << "option2, " << size / MB << "MB: " << option_2(size) << "ms" << std::endl;
    for (std::size_t size = 1 * MB; size <= 4 * GB; size *= 2) std::cout << "option3, " << size / MB << "MB: " << option_3(size) << "ms" << std::endl;

    return 0;
}

This code compiles with Visual Studio 2017 and g++ 7.2.0 (a new requirements). I ran the code with two setups:

  • Laptop, Core i7, SSD, Ubuntu 16.04, g++ Version 7.2.0 with -std=c++11 -march=native -O3
  • Desktop, Core i7, SSD, Windows 10, Visual Studio 2017 Version 15.3.1 with /Ox /Ob2 /Oi /Ot /GT /GL /Gy

Which gave the following measurements (after ditching the values for 1MB, because they were obvious outliers): enter image description here enter image description here Both times option1 and option3 max out my SSD. I didn't expect this to see, because option2 used to be the fastest code on my old machine back then.

TL;DR: My measurements indicate to use std::fstream over FILE.

like image 169
Dominic Hofer Avatar answered Nov 01 '22 00:11

Dominic Hofer


Try the following, in order:

  • Smaller buffer size. Writing ~2 MiB at a time might be a good start. On my last laptop, ~512 KiB was the sweet spot, but I haven't tested on my SSD yet.

    Note: I've noticed that very large buffers tend to decrease performance. I've noticed speed losses with using 16-MiB buffers instead of 512-KiB buffers before.

  • Use _open (or _topen if you want to be Windows-correct) to open the file, then use _write. This will probably avoid a lot of buffering, but it's not certain to.

  • Using Windows-specific functions like CreateFile and WriteFile. That will avoid any buffering in the standard library.

like image 26
user541686 Avatar answered Nov 01 '22 00:11

user541686


I see no difference between std::stream/FILE/device. Between buffering and non buffering.

Also note:

  • SSD drives "tend" to slow down (lower transfer rates) as they fill up.
  • SSD drives "tend" to slow down (lower transfer rates) as they get older (because of non working bits).

I am seeing the code run in 63 secondds.
Thus a transfer rate of: 260M/s (my SSD look slightly faster than yours).

64 * 1024 * 1024 * 8 /*sizeof(unsigned long long) */ * 32 /*Chunks*/

= 16G
= 16G/63 = 260M/s

I get a no increase by moving to FILE* from std::fstream.

#include <stdio.h>

using namespace std;

int main()
{
    
    FILE* stream = fopen("binary", "w");

    for(int loop=0;loop < 32;++loop)
    {
         fwrite(a, sizeof(unsigned long long), size, stream);
    }
    fclose(stream);

}

So the C++ stream are working as fast as the underlying library will allow.

But I think it is unfair comparing the OS to an application that is built on-top of the OS. The application can make no assumptions (it does not know the drives are SSD) and thus uses the file mechanisms of the OS for transfer.

While the OS does not need to make any assumptions. It can tell the types of the drives involved and use the optimal technique for transferring the data. In this case a direct memory to memory transfer. Try writing a program that copies 80G from 1 location in memory to another and see how fast that is.

Edit

I changed my code to use the lower level calls:
ie no buffering.

#include <fcntl.h>
#include <unistd.h>


const unsigned long long size = 64ULL*1024ULL*1024ULL;
unsigned long long a[size];
int main()
{
    int data = open("test", O_WRONLY | O_CREAT, 0777);
    for(int loop = 0; loop < 32; ++loop)
    {   
        write(data, a, size * sizeof(unsigned long long));
    }   
    close(data);
}

This made no diffference.

NOTE: My drive is an SSD drive if you have a normal drive you may see a difference between the two techniques above. But as I expected non buffering and buffering (when writting large chunks greater than buffer size) make no difference.

Edit 2:

Have you tried the fastest method of copying files in C++

int main()
{
    std::ifstream  input("input");
    std::ofstream  output("ouptut");

    output << input.rdbuf();
}
like image 23
Martin York Avatar answered Oct 31 '22 22:10

Martin York