I have a file of the following format:
1: some_basic_info_in_this_line
2: LOTS_OF_INFO_IN_THIS_LINE_HUNDREDS_OF_CHARS
3: some_basic_info_in_this_line
4: LOTS_OF_INFO_IN_THIS_LINE_HUNDREDS_OF_CHARS
...
That format repeats itself tens of thousands of times, making files of 50 GiB or more. I need an efficient way to process only line 2 of this format. I'm open to using C, C++11 STL, or Boost. I've looked at various other questions regarding file streaming on SO, but I feel like my situation is unique because of the large file size and only needing one out of every four lines.
Memory mapping the file seems to be the most efficient approach from what I've read, but mapping a 50+ GiB file will eat up most computers' RAM (you can assume that this application will be used by "average" users, say with 4-8 GiB of RAM). Also, I only need to process one of the lines at a time. Here is how I am currently doing this (yes, I'm aware this is not efficient; that's why I'm redesigning it):
    std::string GL::getRead(ifstream& input)
    {
        std::string str;
        std::string toss;
        if (input.good())
        {
            getline(input, toss);
            getline(input, str);
            getline(input, toss);
            getline(input, toss);
        }
        return str;
    }
Is breaking the mmap into blocks the answer for my situation? Is there any way that I can take advantage of only needing 1 out of every 4 lines? Thanks for the help.
Use ignore instead of getline:
    std::string GL::getRead(ifstream& input)
    {
        std::string str;
        if (!input.fail())
        {
            input.ignore(LARGE_NUMBER, '\n');
            getline(input, str);
            input.ignore(LARGE_NUMBER, '\n');
            input.ignore(LARGE_NUMBER, '\n');
        }
        return str;
    }
LARGE_NUMBER could be std::numeric_limits<std::streamsize>::max() (from <limits>) if you don't have a good reason to use a smaller number (think of DoS attacks).
TIP: Consider passing str by reference. By reading into the same string each time, you can avoid a lot of allocations, which are typically the number one reason a program runs slowly.
TIP: Consider using a memory-mapped file (Boost Iostreams, Boost Interprocess, or mmap(2)).