c++ fastest way to read only last line of text file?

Tags: c++, iostream, seek

I would like to read only the last line of a text file (I'm on UNIX, can use Boost). All the methods I know require scanning through the entire file to get the last line which is not efficient at all. Is there an efficient way to get only the last line?

Also, I need this to be robust enough that it works even if the text file in question is constantly being appended to by another process.

user788171 asked Aug 09 '12 03:08


2 Answers

Use seekg to jump to the end of the file, then read back until you find the first newline. Below is some sample code off the top of my head using MSVC.

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main()
{
    string filename = "test.txt";
    ifstream fin;
    fin.open(filename, ios_base::binary);           // binary mode so seek offsets are plain byte offsets
    if(fin.is_open()) {
        fin.seekg(-1,ios_base::end);                // go to one spot before the EOF

        bool keepLooping = true;
        while(keepLooping) {
            char ch;
            fin.get(ch);                            // Get current byte's data

            if(static_cast<streamoff>(fin.tellg()) <= 1) { // If the data was at or before the 0th byte (streamoff avoids int overflow on large files)
                fin.seekg(0);                       // The first line is the last line
                keepLooping = false;                // So stop there
            }
            else if(ch == '\n') {                   // If the data was a newline
                keepLooping = false;                // Stop at the current position.
            }
            else {                                  // If the data was neither a newline nor at the 0 byte
                fin.seekg(-2,ios_base::cur);        // Move to the front of that data, then to the front of the data before it
            }
        }

        string lastLine;            
        getline(fin,lastLine);                      // Read the current line
        cout << "Result: " << lastLine << '\n';     // Display it

        fin.close();
    }

    return 0;
}

Below is a test file. The code works with empty, one-line, and multi-line files.

This is the first line.
Some stuff.
Some stuff.
Some stuff.
This is the last line.
derpface answered Oct 10 '22 12:10


While the answer by derpface is correct, it often returns unexpected results. The reason is that, at least on my operating system (Mac OS X 10.9.5), many text editors terminate their files with an 'end line' (newline) character.

For example, when I open vim, type just the single character 'a' (no return), and save, the file will now contain (in hex):

61 0A

Where 61 is the letter 'a' and 0A is an end of line character.
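To check whether a given file ends with such a character, you can print its bytes in hex. Below is a minimal sketch; the helper name hexDump is made up for illustration and is not part of either answer.

```cpp
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (an assumption for illustration): returns the bytes of
// a file as space-separated, two-digit uppercase hex values.
std::string hexDump(const std::string& path) {
    std::ifstream fin(path, std::ios_base::binary); // binary: no newline translation
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');

    char ch;
    bool first = true;
    while (fin.get(ch)) {                           // read one byte at a time
        if (!first) out << ' ';
        first = false;
        // Go through unsigned char so bytes >= 0x80 do not print as negative.
        out << std::setw(2) << static_cast<int>(static_cast<unsigned char>(ch));
    }
    return out.str();                               // empty string if the file is empty or unreadable
}
```

Calling hexDump on the file saved by vim should give a string ending in 0A if the editor appended a newline.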

This means that the code by derpface will return an empty string on all files created by such a text editor.

While I can certainly imagine cases where a file terminated with an 'end line' should return the empty string, I think ignoring the last 'end line' character is more appropriate for regular text files. If the file is terminated by an 'end line' character, we properly ignore it; if it is not, the last character cannot be an 'end line', so we do not need to check it.

My code for ignoring the last character of the input file is:

#include <iostream>
#include <string>
#include <fstream>
#include <iomanip>

int main() {
    std::string result = "";
    std::ifstream fin("test.txt");

    if(fin.is_open()) {
        fin.seekg(0,std::ios_base::end);      //Start at end of file
        char ch = ' ';                        //Init ch not equal to '\n'
        while(ch != '\n'){
            fin.seekg(-2,std::ios_base::cur); //Two steps back, this means we
                                              //will NOT check the last character
            if((int)fin.tellg() <= 0){        //If passed the start of the file,
                fin.clear();                  //clear the failbit left by a failed seek,
                fin.seekg(0);                 //this is the start of the line
                break;
            }
            fin.get(ch);                      //Check the next character
        }

        std::getline(fin,result);
        fin.close();

        std::cout << "final line length: " << result.size() <<std::endl;
        std::cout << "final line character codes: ";
        for(size_t i =0; i<result.size(); i++){
            std::cout << std::hex << (int)result[i] << " ";
        }
        std::cout << std::endl;
        std::cout << "final line: " << result <<std::endl;
    }

    return 0;
}

Which will output:

final line length: 1
final line character codes: 61 
final line: a

On the single 'a' file.

EDIT: The line if((int)fin.tellg() <= 0){ can cause problems if the file is too large (> 2GB): the cast to a 32-bit int can overflow, and tellg is not guaranteed to return the number of characters from the start of the file (see the question "tellg() function give wrong size of file?"). It may be better to test separately for the start of the file, fin.tellg() == tellgValueForStartOfFile, and for errors, fin.tellg() == -1. The value of tellgValueForStartOfFile is probably 0, but a safer way to obtain it would probably be:

fin.seekg(0, std::ios_base::beg);
std::streampos tellgValueForStartOfFile = fin.tellg();
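
The separate start-of-file and error checks described above could be sketched as follows. This is only an illustration under some assumptions: the function name readLastLine is hypothetical, the file is opened in binary mode, and a single trailing '\n' is ignored as in the code earlier in this answer. Positions are kept as std::streamoff rather than int, so there is no cast that could overflow on large files.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (name and behavior are assumptions for illustration):
// returns the last line of a file, ignoring one trailing '\n' if present.
// Returns an empty string if the file is empty or cannot be read.
std::string readLastLine(const std::string& path) {
    std::ifstream fin(path, std::ios_base::binary);
    if (!fin.is_open()) return "";

    fin.seekg(0, std::ios_base::end);
    const std::streampos endPos = fin.tellg();
    if (endPos == std::streampos(-1)) return "";   // tellg reports errors as -1

    fin.seekg(0, std::ios_base::beg);
    const std::streampos startPos = fin.tellg();   // probably 0, but measured
    const std::streamoff size = endPos - startPos; // wide enough for > 2GB
    if (size <= 0) return "";                      // empty file

    // Offset of the first byte of the last line. Starting at size - 1 means
    // the final byte (often a newline) is never tested.
    std::streamoff lineStart = size - 1;
    while (lineStart > 0) {
        fin.seekg(lineStart - 1, std::ios_base::beg);
        char ch;
        if (!fin.get(ch)) return "";               // unexpected read error
        if (ch == '\n') break;                     // line starts right after it
        --lineStart;
    }

    fin.seekg(lineStart, std::ios_base::beg);
    std::string line;
    std::getline(fin, line);
    return line;
}
```

For the single 'a' file (with or without the trailing newline), readLastLine("test.txt") should return "a". This version seeks once per byte for simplicity; if the file is being appended to by another process, the size is only a snapshot taken when the function is called.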
Joost Huizinga answered Oct 10 '22 12:10