 

How to read a huge file in C++

Tags: c++, file, ifstream

Suppose I have a huge file (e.g. 1TB, or any size that does not fit into RAM; the file is stored on disk) that is delimited by spaces, and my RAM is only 8GB. Can I read that file with ifstream? If not, how do I read a block of the file (e.g. 4GB) at a time?

asked Jan 12 '16 by ZigZagZebra



2 Answers

There are a couple of things that you can do.

First, there's no problem opening a file that is larger than the amount of RAM you have. What you won't be able to do is copy the whole file into memory at once. The best approach is to read the file a chunk at a time and process each chunk. You can use ifstream for that purpose (with ifstream::read, for instance). Allocate, say, one megabyte of memory, read the first megabyte of the file into it, rinse and repeat:

#include <fstream>
#include <memory>

std::ifstream bigFile("mybigfile.dat", std::ios::binary);
constexpr size_t bufferSize = 1024 * 1024;  // read in 1 MiB chunks
std::unique_ptr<char[]> buffer(new char[bufferSize]);
while (bigFile)
{
    bigFile.read(buffer.get(), bufferSize);
    // gcount() is the number of bytes the last read() actually
    // produced; the final chunk is usually shorter than bufferSize
    std::streamsize bytesRead = bigFile.gcount();
    // process bytesRead bytes of data in buffer
}
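Note that since the file is space-delimited, a token can straddle two chunks; one way to deal with that is to carry the trailing partial token over into the next chunk. A rough sketch of that idea (the carry-over logic is an illustration, and the file name is a placeholder):

#include <fstream>
#include <memory>
#include <string>

int main()
{
    std::ifstream bigFile("mybigfile.dat", std::ios::binary);  // placeholder name
    constexpr size_t bufferSize = 1024 * 1024;
    std::unique_ptr<char[]> buffer(new char[bufferSize]);

    std::string carry;  // incomplete trailing token from the previous chunk
    while (bigFile)
    {
        bigFile.read(buffer.get(), bufferSize);
        std::string chunk = carry + std::string(buffer.get(),
                                                static_cast<size_t>(bigFile.gcount()));

        // Keep everything after the last space for the next iteration,
        // since that token may continue in the next chunk
        size_t lastSpace = chunk.find_last_of(' ');
        if (lastSpace == std::string::npos)
        {
            carry = chunk;  // no delimiter seen in this chunk yet
            continue;
        }
        carry = chunk.substr(lastSpace + 1);
        // ... parse the complete tokens in chunk.substr(0, lastSpace) ...
    }
    // carry now holds the final token, if any
}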

Another solution is to map the file to memory. Most operating systems will allow you to map a file to memory even if it is larger than the physical amount of memory that you have. This works because the operating system knows that each memory page associated with the file can be mapped and unmapped on-demand: when your program needs a specific page, the OS will read it from the file into your process's memory and swap out a page that hasn't been used in a while.

However, this can only work if the file is smaller than the maximum amount of memory that your process can theoretically use. This isn't an issue with a 1TB file in a 64-bit process, but it wouldn't work in a 32-bit process.

Also be aware of the spirits that you're summoning. Memory-mapping a file is not the same thing as reading from it. If the file is suddenly truncated by another program, your program is likely to crash. If you modify the data, it's possible that you will run out of memory if you can't save back to the disk. Also, your operating system's algorithm for paging memory in and out may not work in your favor. Because of these uncertainties, I would consider mapping the file only if reading it in chunks with the first solution cannot work.

On Linux/OS X, you would use mmap. On Windows, you would open the file and then call CreateFileMapping followed by MapViewOfFile.
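For reference, a rough POSIX sketch of the mmap route (error handling kept terse; the file name is a placeholder):

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int fd = open("mybigfile.dat", O_RDONLY);  // placeholder name
    if (fd == -1) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); close(fd); return 1; }

    // Map the whole file read-only; the OS faults pages in on demand,
    // so the mapping can be far larger than physical RAM
    void* p = mmap(nullptr, static_cast<size_t>(st.st_size),
                   PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    const char* data = static_cast<const char*>(p);
    // ... scan data[0] .. data[st.st_size - 1] like one huge array ...

    munmap(p, static_cast<size_t>(st.st_size));
    close(fd);
}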

answered Sep 28 '22 by zneak


You don't have to keep the whole file in memory; typically one reads and processes a file in chunks. If you want to use ifstream, you can do something like this:

#include <fstream>

std::ifstream is("/path/to/file", std::ios::binary);
char buf[4096];
do {
    is.read(buf, sizeof(buf));
    // gcount() is the number of bytes actually read; it can be
    // smaller than sizeof(buf) on the final chunk
    process_chunk(buf, is.gcount());
} while (is);
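Since the file is space-delimited, you can also let the stream do the chunking for you: operator>> extracts one whitespace-delimited token at a time while the ifstream buffers the underlying reads internally. A rough sketch (the path is a placeholder):

#include <fstream>
#include <string>

int main()
{
    std::ifstream is("/path/to/file");  // placeholder path
    std::string token;
    // operator>> skips whitespace and extracts the next
    // space/newline-delimited token; only the current token
    // is held in memory, never the whole file
    while (is >> token)
    {
        // ... process token here ...
    }
}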
answered Sep 28 '22 by Oleg Andriyanov