 

The fastest way to write data while producing it

In my program I am simulating an N-body system for a large number of iterations. Each iteration produces a set of 6N coordinates, which I need to append to a file and then use for the next iteration. The code is written in C++ and currently uses ofstream's write() method to write the data in binary format at each iteration.
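For reference, the write at each iteration currently looks roughly like this (a simplified sketch; the names, sizes, and the placeholder integration step are just illustrative):

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

constexpr std::size_t N = 1024;       // number of bodies (illustrative)
constexpr long n_iterations = 10000;  // illustrative

// placeholder for the actual force computation / integration step
void compute_next_state(std::vector<double>& state) { /* ... */ }

int main() {
    std::vector<double> state(6 * N);  // 6 coordinates per body
    std::ofstream out("trajectory.bin", std::ios::binary);

    for (long iter = 0; iter < n_iterations; ++iter) {
        compute_next_state(state);     // advance the simulation
        out.write(reinterpret_cast<const char*>(state.data()),
                  state.size() * sizeof(double));  // one write per iteration
    }
}
```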

I am not an expert in this field, but I would like to improve this part of the program, since I am in the process of optimizing the whole code. I feel that the latency associated with writing the result of the computation at each cycle significantly slows down the performance of the software.

I have no experience with actual parallel programming or low-level file I/O, so I am not sure how to proceed. I have thought of a few techniques I could implement, since I am programming for modern (possibly multi-core) machines with Unix OSes:

  • Writing the data to the file in chunks of n iterations (though there may be better ways to proceed; see the sketch below)
  • Parallelizing the code with OpenMP (but how would I implement a buffer so that the threads are synchronized appropriately and do not overlap?)
  • Using mmap (the file could be huge, on the order of GBs; is this approach robust enough?)

However, I don't know how to best implement these techniques and combine them appropriately.
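For the first idea, something like this is what I had in mind (an untested sketch; the chunk size is hypothetical and would need tuning):

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

constexpr std::size_t N = 1024;    // number of bodies (illustrative)
constexpr std::size_t chunk = 64;  // iterations buffered per write (to be tuned)
constexpr long n_iterations = 10000;

void compute_next_state(std::vector<double>& state) { /* ... */ }

int main() {
    std::vector<double> state(6 * N);
    std::vector<double> buffer;
    buffer.reserve(6 * N * chunk);

    std::ofstream out("trajectory.bin", std::ios::binary);

    for (long iter = 0; iter < n_iterations; ++iter) {
        compute_next_state(state);
        buffer.insert(buffer.end(), state.begin(), state.end());

        if (buffer.size() == 6 * N * chunk) {  // flush every `chunk` iterations
            out.write(reinterpret_cast<const char*>(buffer.data()),
                      buffer.size() * sizeof(double));
            buffer.clear();
        }
    }
    if (!buffer.empty())  // flush the final partial chunk
        out.write(reinterpret_cast<const char*>(buffer.data()),
                  buffer.size() * sizeof(double));
}
```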

asked Jan 03 '12 by Fiat Lux

2 Answers

Writing to a file at each iteration is indeed inefficient and most likely slows down your computation (as a rule of thumb; it depends on your actual case).

You have to use a producer -> consumer design pattern: the two sides are linked by a queue, like a conveyor belt.

  • The producer will try to produce as fast as it can, slowing down only if the consumer can't keep up.
  • The consumer will try to "consume" as fast as it can.

By splitting the two, you can increase performance more easily because each part is simpler and suffers less interference from the other.

  • If the producer is faster, you need to improve the consumer, in your case by writing to the file in the most efficient way, most likely chunk by chunk (as you said).
  • If the consumer is faster, you need to improve the producer, most likely by parallelizing it, as you said.

There is no need to optimize both; only optimize the slower one (the bottleneck).

In practice, you use threads and a synchronized queue between them. For implementation hints, have a look here, especially §18.12, "The Producer-Consumer Pattern".

Regarding flow management, you'll have to add a little more complexity by choosing a "max queue size" and making the producer(s) wait if the queue does not have enough space. Beware of deadlocks; code it carefully (see the Wikipedia link I gave about that).
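For illustration, here is a minimal sketch of such a synchronized bounded queue using C++11 primitives (Boost's thread library offers equivalent tools; the class and member names are just for the example):

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Bounded blocking queue: push() blocks while full, pop() blocks while empty.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    void push(T item) {
        std::unique_lock<std::mutex> lock(m_);
        not_full_.wait(lock, [this] { return q_.size() < capacity_; });
        q_.push_back(std::move(item));
        not_empty_.notify_one();  // wake a waiting consumer
    }

    T pop() {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [this] { return !q_.empty(); });
        T item = std::move(q_.front());
        q_.pop_front();
        not_full_.notify_one();   // wake a waiting producer
        return item;
    }

private:
    std::size_t capacity_;
    std::deque<T> q_;
    std::mutex m_;
    std::condition_variable not_full_, not_empty_;
};
```

The producer thread pushes each iteration's 6N coordinates into the queue, and the consumer thread pops them and writes them to the file. Because push() blocks when the queue is full, the producer is throttled automatically and memory use stays bounded.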

Note: it's a good idea to use Boost threads, because raw threads are not very portable (well, they are since C++0x, but C++0x availability is not yet good).

answered Nov 08 '22 by Offirmo


It's better to split the operation into two independent processes: data production and file writing. The data-producing process would use a buffer to pass along each iteration's data, and the file-writing process would use a queue to store write requests. The producer then just posts a write request and moves on, while the writer copes with the writing in the background.

Essentially, if the data is produced much faster than it can possibly be stored, you'll quickly end up holding most of it in the buffer. In that case your current approach seems quite reasonable as is, since little can then be done programmatically to improve the situation.
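A minimal sketch of this design, with all names illustrative (note that the request queue here is unbounded, which is exactly why memory fills up when production outruns the disk):

```cpp
#include <condition_variable>
#include <deque>
#include <fstream>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    std::deque<std::vector<double>> requests;  // pending write requests
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    // File-writing process: drains the request queue to disk in the background.
    std::thread writer([&] {
        std::ofstream out("trajectory.bin", std::ios::binary);
        for (;;) {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [&] { return !requests.empty() || done; });
            if (requests.empty() && done) break;
            std::vector<double> data = std::move(requests.front());
            requests.pop_front();
            lock.unlock();  // write without holding the lock
            out.write(reinterpret_cast<const char*>(data.data()),
                      data.size() * sizeof(double));
        }
    });

    // Data-producing process: posts a write request and goes on immediately.
    std::vector<double> state(6 * 1024);  // 6 coordinates per body, N = 1024
    for (long iter = 0; iter < 10000; ++iter) {
        // compute_next_state(state);     // the simulation step would go here
        {
            std::lock_guard<std::mutex> lock(m);
            requests.push_back(state);    // copy the snapshot into the queue
        }
        cv.notify_one();
    }

    { std::lock_guard<std::mutex> lock(m); done = true; }
    cv.notify_one();
    writer.join();
}
```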

answered Nov 08 '22 by vines