I have a text file (~10GB) with the following format:
data1<TAB>data2<TAB>data3<TAB>data4<NEWLINE>
I want to scan through it and process only data2. What is the best (fastest) way to extract data2 in C++?
EDIT: Added NEWLINE
Read the file line by line. For each line, split on the tab. That will leave you with an array containing the fields, allowing you to work with the second field (data2).
This sounds like a job for a higher-level tool such as the standard shell utilities:
cut -f2 # from stdin
cut -f2 <my_file # from file
But nonetheless, you can do that with C++ as well:
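#include <fstream>
#include <iostream>
#include <string>

void process(const std::string& data2); // your processing, defined elsewhere

// Note: operator>> splits on any whitespace, not only tabs,
// so this assumes that no field contains spaces.
void parse(std::istream& in)
{
    std::string skip, data2;
    while (in >> skip >> data2) {   // data1 (thrown away), then data2
        process(data2);
        in >> skip >> skip;         // throw away data3 and data4
    }
}
// ...
parse(std::cin);                    // read from stdin
// or
std::ifstream file("my_file");      // read from a file
parse(file);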