
C++ High Performance File Reading and Writing (C++14)

Tags: c++, io

I’m writing a C++14 program that loads text strings from a file, does some computation on them, and writes the results to another file. I’m using Linux, and the files are relatively large (on the order of 10^6 lines). My usual approach is to use the old C-style getline and sscanf to read and parse the input, and fprintf(FILE*, …) to write the output. This works, but I’m wondering whether there is a better way, both for high performance and as the generally recommended approach in the modern C++ standard I’m using. I’ve heard that iostream is quite slow; if that’s true, is there a more recommended alternative?

Update: To clarify a bit on the use case: for each line of the input file, I'll be doing some text manipulation (data cleanup, etc.). Each line is independent. So, loading the entire input file (or, at least large chunks of it), and processing it line by line, and then writing it, seems to make the most sense. The ideal abstraction for this would be to get an iterator to the read-in buffer, with each line being an entry. Is there a recommended way to do that with std::ifstream?

Kulluk007 asked Jul 28 '16 20:07



2 Answers

The fastest option, if you have enough memory, is to read the entire file into a buffer with a single read, process the buffer in memory, and write it all out again with a single write.

Read it all:

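std::string buffer;

// Open in binary mode so the size reported by tellg matches the bytes read.
std::ifstream f("file.txt", std::ios::binary);
f.seekg(0, std::ios::end);
buffer.resize(f.tellg());
f.seekg(0);
// Before C++17, std::string::data() returns const char*, so use &buffer[0].
f.read(&buffer[0], buffer.size());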

Then process it in memory.

Then write it all:

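// "out.txt" is a placeholder name; the question writes to a separate output file.
std::ofstream out("out.txt", std::ios::binary);
out.write(buffer.data(), buffer.size());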
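
The "process it" step is left open above. As a rough sketch of what it could look like for the independent, line-by-line cleanup described in the question: the code below walks the buffer one line at a time and builds a single output string, so it can still be written out with one write. The names clean_line and process_buffer are hypothetical, and trimming trailing whitespace is only a stand-in for whatever cleanup is actually needed. It compiles as C++14.

```cpp
#include <cstddef>
#include <string>

// Hypothetical per-line cleanup, used only for illustration: strips trailing
// spaces, tabs and carriage returns from one line.
std::string clean_line(const std::string& line) {
    const std::size_t last = line.find_last_not_of(" \t\r");
    return last == std::string::npos ? std::string() : line.substr(0, last + 1);
}

// Walks `input` one '\n'-terminated line at a time and appends each cleaned
// line (plus '\n') to one output string.
std::string process_buffer(const std::string& input) {
    std::string output;
    output.reserve(input.size());  // cleanup here never grows a line by more than one '\n'
    std::size_t pos = 0;
    while (pos < input.size()) {
        std::size_t eol = input.find('\n', pos);
        if (eol == std::string::npos) {
            eol = input.size();  // final line without a trailing '\n'
        }
        output += clean_line(input.substr(pos, eol - pos));
        output += '\n';
        pos = eol + 1;
    }
    return output;
}
```

With the snippets above, this would sit between the read and the write: call process_buffer(buffer) and write the returned string instead of buffer. The substr call copies each line; if that copy shows up in profiling, scanning with indices or pointers into the buffer avoids it.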
David answered Oct 02 '22 07:10


If you have C++17, std::filesystem offers another way: get the file's size through std::filesystem::file_size instead of seekg and tellg. I presume this lets you avoid seeking to the end of the file and back just to learn its size.

It's shown in this answer
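
The linked answer is not reproduced here, so the following is only a minimal sketch of the idea, assuming a C++17 compiler. The helper name read_file is hypothetical. It sizes the buffer from std::filesystem::file_size and then does one read; in C++17 std::string::data() returns a non-const pointer, so it can be passed to read directly.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads a whole file into a string, taking the size from
// file system metadata instead of seekg/tellg.
std::string read_file(const std::filesystem::path& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path.string());
    }
    // file_size throws std::filesystem::filesystem_error if it fails.
    const auto size = std::filesystem::file_size(path);
    std::string buffer(static_cast<std::size_t>(size), '\0');
    in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    // Keep only the bytes actually read, in case the file changed in between.
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}
```

Whether this is measurably faster than the seekg and tellg version is not something the sketch establishes; the main cost in both cases is the single large read.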

Kari answered Oct 02 '22 08:10