I need to write a program that writes many characters to an output file.
My program will also need to write newlines for better formatting.
I understand that ofstream is a buffered stream and that using a buffered stream for file I/O improves performance. However, if we use std::endl, the output is flushed and we lose any potential performance gain from the buffered output.
I suppose that if I use '\n' for newlines, the output will only be flushed when we use std::endl. Is this correct? And are there any tricks that can be used to get a performance gain during file output?
Note: I want to flush the buffered output at the completion of the file write operations. I think in this way I can minimize file I/O and thus can gain performance.
Generally, users of stream classes shouldn't mess with the stream's flushing if maximum performance is wanted: the streams flush their buffer internally when it is full. This is actually more efficient than waiting until all output is ready, especially with large files: the buffered data is written while it is still likely to be in memory. If you create a huge buffer and only write it once, the virtual memory system will have put parts of the data onto disk, but not into the file. That data would need to be read back from disk and written again.
The main point with respect to std::endl is that people abuse it as a line ending, which causes the buffer to flush, and they are unaware of the performance implications. The intention of std::endl is to give people control to flush files at reasonable points. For this to be effective they need to know what they are doing. Sadly, too many people ignorant of what std::endl does advertised its use as a line ending, so it is now used in many places where it is plain wrong.
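To make the cost of std::endl visible, here is a minimal sketch that counts how often a stream asks its buffer to flush. It is only an illustration: CountingBuf, flushes_with_endl and flushes_with_newline are hypothetical names, and the buffer simply discards its input instead of writing to a file.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>

// Hypothetical helper: a stream buffer that discards every character and
// counts how often the stream requests a flush via sync().
class CountingBuf : public std::streambuf {
public:
    std::size_t syncs = 0;

protected:
    // No put area is set up, so every character arrives here; pretend the
    // write succeeded.
    int_type overflow(int_type ch) override {
        return traits_type::not_eof(ch);
    }

    // Called through pubsync() whenever the stream is flushed.
    int sync() override {
        ++syncs;
        return 0;
    }
};

// Writes `lines` lines terminated by std::endl and returns the flush count.
std::size_t flushes_with_endl(int lines) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) out << "line " << i << std::endl;
    return buf.syncs;
}

// Writes `lines` lines terminated by '\n' and returns the flush count.
std::size_t flushes_with_newline(int lines) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) out << "line " << i << '\n';
    return buf.syncs;
}
```

With a real std::ofstream each of those flush requests typically ends up as a write to the operating system, which is where the time goes.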
That said, below are a number of things you might want to try to improve performance. I assume you need formatted output (which the use of std::ofstream::write() won't give you).
Don't use std::endl unless you have to. If the writing code already exists and uses std::endl in many places, some of which are possibly outside your control, you can use a filtering stream buffer which uses its own internal buffer of reasonable size and which doesn't forward calls to its sync() function to the underlying stream buffer. Although this involves an extra copy, this is better than some spurious flushes, as these are orders of magnitude more expensive.
For std::ofstreams, calling std::ios_base::sync_with_stdio(false) used to affect the performance on some implementations. You'd want to look at using a different IOstream implementation if this has an effect, because there are probably more things wrong with respect to performance.
Use a std::locale whose std::codecvt<...> facet returns true when calling its always_noconv(). This can easily be checked by using std::use_facet<std::codecvt<char, char, std::mbstate_t> >(out.getloc()).always_noconv(). You can use std::locale("C") to get hold of a std::locale for which this should be true.
The std::num_put<char> facet may still do things you don't really need. Especially if your numeric formatting is reasonably simple, i.e. you don't keep changing formatting flags, you haven't replaced the mapping of characters (i.e. you don't use a funny std::ctype<char> facet), etc., it may be reasonable to use a custom std::num_put<char> facet: it is fairly easy to create a fast but simple formatting function for integer types and a good formatting function for floating points which doesn't use snprintf() internally.
Some people have suggested the use of memory mapped files, but this only works reasonably when the size of the target file is known in advance. If this is the case, this is a great way to also improve performance; otherwise it isn't worth the bother. Note that you can use the stream formatting with memory mapped files (or, more generally, with any kind of output interface) by creating a custom std::streambuf which uses the memory mapping interface. I found memory mapping sometimes effective when using it with std::istreams. In many cases the differences don't really matter much.
A long time ago I wrote my own IOStreams and locales implementation which doesn't suffer from some of the performance problems mentioned above (it is available from my site, but it is a bit stale and I haven't touched it for nearly 10 years now). There are still lots of things which can be improved over this implementation, but I don't have an up-to-date implementation which I'd be ready to post somewhere. Soon, hopefully, although that is something I have kept thinking for nearly 10 years...
Printing a \n will not (necessarily) flush the output, while printing std::endl or std::flush will.
If you want fast writing and don't care if the data is there until you're completely done, then do all of your writing with \n and don't worry about it (since closing the file will also flush the stream).
If you're still not getting the performance you want, you could use fstream::write(const char*, std::streamsize), which lets you write data in whatever size blocks you want (try bigger blocks and see if it helps).