
Efficient way to save data to disk while running a computationally intensive task

I'm working on a piece of scientific software that is very CPU-intensive (it's processor-bound), but it also needs to write data to disk fairly often (so it's I/O-bound too).

I'm adding parallelization (OpenMP) and I'm wondering what the best way is to handle the write-to-disk side of things. There's no reason the simulation should wait on the hard disk (which is what it's doing now).

I'm looking for a 'best practice' for this, and speed is what I care about most (these can be hugely long simulations).

Thanks ~Alex

First thoughts:

Having a separate process do the actual writing to disk, so the simulation has two processes: one CPU-bound (the simulation) and one I/O-bound (writing the file). This sounds complicated.

Possibly a pipe/buffer? I'm fairly new to these, so maybe that could be a possible solution.

machinaut asked Jun 18 '09


4 Answers

I'd say the best way would be to spawn a separate thread to save the data, not a completely new process; with a new process, you run into the trouble of having to communicate the data to be saved across the process boundary, which introduces a new set of difficulties.

Paul Sonier answered Nov 10 '22


The first solution that comes to mind is pretty much what you've said: put the disk writes in their own process with a one-way pipe from the sim to the writer. The writer does writes as fast as possible, drawing new data off the pipe. The problem is that if the sim gets too far ahead of the writer, the sim will block on the pipe writes anyway, and it will be I/O-bound at one remove.

The problem is that in fact your simulation cycle isn't complete until it's spit out the results.

The second thing that occurs to me is to use non-blocking I/O. Whenever the sim needs to write, it should do so via non-blocking I/O. On the next need to write, it can then pick up the results of the previous I/O operation (possibly incurring a small wait) before starting the new one. This keeps the simulation running as much as possible in parallel with the I/O without letting the simulation get very far ahead of the writing.

The first solution would be better if the simulation processing cycle varies (sometimes smaller than the time for a write, sometimes longer) because on average the writes might keep up with the sim.

If the processing cycle is always (or almost always) going to be shorter than the write time then you might as well not bother with the pipe and just use non-blocking I/O, because if you use the pipe it will eventually fill up and the sim will get hung up on the I/O anyway.

Michael Kohne answered Nov 10 '22


If you are adding OpenMP to your program, then it is better to use #pragma omp single or #pragma omp master inside the parallel section to save to a file. These pragmas allow only one thread to execute a block of code. So your code may look like the following:

#pragma omp parallel
{
    // Calculate the first part
    Calculate();

    // Use a barrier to wait for all threads
    #pragma omp barrier

    // Note: master has no implied barrier, so the other threads
    // continue into Calculate2 while the master thread saves.
    #pragma omp master
    SaveFirstPartOfResults();

    // Calculate the second part
    Calculate2();

    #pragma omp barrier

    #pragma omp master
    SaveSecondPart();

    Calculate3();

    // ... and so on
}

Here the team of threads does the calculation, but only a single thread saves the results to disk.

This looks like a software pipeline. I suggest you consider the tbb::pipeline pattern from the Intel Threading Building Blocks library. I may refer you to the tutorial on software pipelines at http://cache-www.intel.com/cd/00/00/30/11/301132_301132.pdf#page=25. Please read paragraph 4.2. They solved this problem: one thread reads from the drive, a second processes the strings that were read, and a third saves to the drive.

Vladimir Obrizan answered Nov 10 '22


Since you are CPU- and I/O-bound, let me guess: there is still plenty of memory available, right?

If so, you should buffer the data that has to be written to disk in memory, to a certain extent. Writing huge chunks of data is usually a lot faster than writing small pieces.

For the writing itself: consider using memory-mapped I/O. It's been a while since I benchmarked it, but last time I did it was significantly faster.

Also, you can always trade off CPU vs. I/O a bit. I assume you're currently writing the data in some kind of raw, uncompressed format, right? You may gain some I/O performance if you use a simple compression scheme to reduce the amount of data to be written. The zlib library is pretty easy to work with and compresses very fast at its lowest compression level. It depends on the nature of your data, but if there is a lot of redundancy in it, even a very crude compression algorithm may eliminate the I/O-bound problem.

Nils Pipenbrinck answered Nov 10 '22