Improving/optimizing file write speed in C++

I've been running into some issues with writing to a file - namely, not being able to write fast enough.

To explain, my goal is to capture a stream of data coming in over gigabit Ethernet and simply save it to a file.

The raw data is coming in at a rate of 10 MS/s, and it's then saved to a buffer and subsequently written to a file.

Below is the relevant section of code:

    std::string path = "Stream/raw.dat";
    ofstream outFile(path, ios::out | ios::app | ios::binary);

    if(outFile.is_open())
        cout << "Yes" << endl;

    while(1)
    {
        rxSamples = rxStream->recv(&rxBuffer[0], rxBuffer.size(), metaData);
        switch(metaData.error_code)
        {
            //Irrelevant error checking...

            //Write data to a file
            std::copy(begin(rxBuffer), end(rxBuffer), std::ostream_iterator<complex<float>>(outFile));
        }
    }

The issue I'm encountering is that it's taking too long to write the samples to a file. After a second or so, the device sending the samples reports its buffer has overflowed. After some quick profiling of the code, nearly all of the execution time is spent on std::copy(...) (99.96% of the time to be exact). If I remove this line, I can run the program for hours without encountering any overflow.

That said, I'm rather stumped as to how I can improve the write speed. I've looked through several posts on this site, and it seems like the most common suggestion (in regard to speed) is to implement file writes as I've already done - through the use of std::copy.

If it's helpful, I'm running this program on Ubuntu x86_64. Any suggestions would be appreciated.

Mlagma, asked Aug 05 '15

2 Answers

So the main problem here is that you try to write in the same thread in which you receive, which means that recv() can only be called again after the copy has completed. A few observations:

  • Move the writing to a different thread (a minimal sketch follows this list). This is about a USRP, so GNU Radio might really be the tool of your choice -- it's inherently multithreaded.
  • Your output iterator is probably not the most performant solution. Simply calling write() on a file descriptor might be faster, but that's a performance measurement you'll have to make yourself.
  • If your hard drive/file system/OS/CPU aren't up to the rates coming in from the USRP, then even decoupling the receiving and writing threads won't help -- get a faster system.
  • Try writing to a RAM disk instead
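
To illustrate the first point, here's a minimal sketch of decoupling reception from disk writes with a writer thread and a shared queue. The buffer type, the queue, the sizes, and the commented-out recv() call are placeholder assumptions for illustration, not the actual UHD API:

    #include <complex>
    #include <condition_variable>
    #include <deque>
    #include <fstream>
    #include <mutex>
    #include <thread>
    #include <vector>

    // Shared queue of received sample buffers, handed from the receive
    // thread to the writer thread.
    std::deque<std::vector<std::complex<float>>> sampleQueue;
    std::mutex queueMutex;
    std::condition_variable queueCv;

    // Writer thread: pops buffers off the queue and writes their raw bytes.
    void writerThread()
    {
        std::ofstream out("Stream/raw.dat", std::ios::binary | std::ios::app);
        std::unique_lock<std::mutex> lock(queueMutex);
        while (true)
        {
            queueCv.wait(lock, [] { return !sampleQueue.empty(); });
            std::vector<std::complex<float>> buf = std::move(sampleQueue.front());
            sampleQueue.pop_front();
            lock.unlock();                           // don't hold the lock while writing
            out.write(reinterpret_cast<const char*>(buf.data()),
                      buf.size() * sizeof(buf[0]));
            lock.lock();
        }
    }

    // Receive loop: fill a buffer, hand it to the writer, repeat.
    void receiveLoop()
    {
        while (true)                                 // mirrors the question's while(1)
        {
            std::vector<std::complex<float>> rxBuffer(4096);   // placeholder size
            // rxSamples = rxStream->recv(&rxBuffer[0], rxBuffer.size(), metaData);
            {
                std::lock_guard<std::mutex> guard(queueMutex);
                sampleQueue.push_back(std::move(rxBuffer));
            }
            queueCv.notify_one();
        }
    }

    int main()
    {
        std::thread writer(writerThread);
        receiveLoop();                               // never returns in this sketch
        writer.join();
    }

This way the receive loop only ever pays for a queue push, and the slow disk write happens concurrently on the other thread.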

In fact, I don't know how you came up with the std::copy approach. The rx_samples_to_file example that comes with UHD does this with a simple write, and you should definitely favor that over copying; file I/O can, on good OSes, often be done with one copy less, and iterating over all elements is probably very slow.
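
For reference, the write in that style boils down to something along these lines; here rxBuffer and rxSamples are assumed to be the vector and returned sample count from the question's code, not UHD's exact variable names:

    // One bulk write of the raw bytes instead of per-element formatted output.
    outFile.write(reinterpret_cast<const char*>(&rxBuffer[0]),
                  rxSamples * sizeof(std::complex<float>));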

Marcus Müller, answered Oct 23 '22


Let's do a bit of math.

Your samples are (apparently) of type std::complex<float>. Given a (typical) 32-bit float, that means each sample is 64 bits. At 10 MS/s, the raw data works out to around 80 megabytes per second--within what you can expect to write to a desktop (7200 RPM) hard drive, but getting fairly close to the limit (which is typically around 100-150 megabytes per second or so).

Unfortunately, despite the std::ios::binary, you're actually writing the data in text format (because std::ostream_iterator basically does stream << data;).

This not only loses some precision, but increases the size of the data, at least as a rule. The exact amount of increase depends on the data--a small integer value can actually decrease the quantity of data, but for arbitrary input, a size increase close to 2:1 is fairly common. With a 2:1 increase, your outgoing data is now around 160 megabytes/second--which is faster than most hard drives can handle.

The obvious starting point for an improvement would be to write the data in binary format instead:

uint32_t nItems = std::end(rxBuffer)-std::begin(rxBuffer);   // number of samples in the buffer
outFile.write((char *)&nItems, sizeof(nItems));              // record the count first
outFile.write((char *)&rxBuffer[0], sizeof(rxBuffer));       // then the raw sample bytes

For the moment I've used sizeof(rxBuffer) on the assumption that it's a real array. If it's actually a pointer or vector, you'll have to compute the correct size (what you want is the total number of bytes to be written).
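
If rxBuffer is a std::vector (which the rxBuffer.size() call in the question suggests), the size computation would look roughly like this instead of sizeof(rxBuffer):

    // Assuming rxBuffer is a std::vector<std::complex<float>>:
    outFile.write((char *)&rxBuffer[0],
                  rxBuffer.size() * sizeof(rxBuffer[0]));    // total bytes, not sizeof the vector object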

I'd also note that, as it stands right now, your code has an even more serious problem: since you haven't specified a separator between elements when writing the data, the values will be written with nothing to separate one item from the next. That means if you wrote two values of (for example) 1 and 0.2, what you'd read back would not be 1 and 0.2, but a single value of 10.2. Adding separators to your text output would add yet more overhead (figure around 15% more data) to a process that's already failing because it generates too much data.

Writing in binary format means each float will consume precisely 4 bytes, so delimiters are not necessary to read the data back in correctly.
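
For completeness, a sketch of reading such a file back in, assuming it was written with the count-then-data layout shown above:

    std::ifstream inFile(path, std::ios::in | std::ios::binary);

    uint32_t nItems = 0;
    inFile.read((char *)&nItems, sizeof(nItems));                     // item count first

    std::vector<std::complex<float>> samples(nItems);
    inFile.read((char *)&samples[0], nItems * sizeof(samples[0]));    // then the raw samples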

The next step after that would be to descend to a lower-level file I/O routine. Depending on the situation, this might or might not make much difference. On Windows, you can specify FILE_FLAG_NO_BUFFERING when you open a file with CreateFile. This means that reads and writes to that file will basically bypass the cache and go directly to the disk.

In your case, that's probably a win--at 10 MS/s, the data will have been evicted from the cache long before you'd ever reread it. In such a case, letting the data go into the cache gains you virtually nothing, but costs you CPU time to copy the data into the cache and then, somewhat later, copy it out to the disk. Worse, it's likely to pollute the cache with all this data, so the cache no longer holds other data that's much more likely to benefit from caching.
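
The question is on Ubuntu rather than Windows; the rough Linux analogue of FILE_FLAG_NO_BUFFERING is opening the file with O_DIRECT. The following is only a minimal sketch under the assumption of a 4096-byte block size--O_DIRECT requires the buffer address, transfer size, and file offset to all be multiples of the device's logical block size:

    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE          // exposes O_DIRECT in <fcntl.h> with glibc
    #endif
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>

    int main()
    {
        // O_DIRECT bypasses the page cache; buffer address, transfer size and
        // file offset must all be aligned to the logical block size.
        int fd = open("Stream/raw.dat",
                      O_WRONLY | O_CREAT | O_APPEND | O_DIRECT, 0644);
        if (fd < 0) { std::perror("open"); return 1; }

        const size_t blockSize  = 4096;       // assumed block size
        const size_t chunkBytes = 1 << 20;    // 1 MiB, a multiple of blockSize

        void *alignedBuf = nullptr;
        if (posix_memalign(&alignedBuf, blockSize, chunkBytes) != 0)
        {
            close(fd);
            return 1;
        }
        std::memset(alignedBuf, 0, chunkBytes);   // stand-in for real sample data

        if (write(fd, alignedBuf, chunkBytes) < 0)
            std::perror("write");

        std::free(alignedBuf);
        close(fd);
        return 0;
    }

Whether this actually beats ordinary buffered writes is something you'd have to measure on your hardware.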

Jerry Coffin, answered Oct 23 '22