
Performance of creating a C++ std::string from an input iterator

Tags: c++, string, stream

I'm doing something really simple: slurping an entire text file from disk into a std::string. My current code basically does this:

std::ifstream f(filename);
return std::string(std::istreambuf_iterator<char>(f), std::istreambuf_iterator<char>());

It's very unlikely that this will ever have any measurable performance impact on the program, but I'm still curious whether this is a slow way of doing it.

Is there a risk that the construction of the string will involve a lot of reallocations? Would it be better (that is, faster) to use seekg()/tellg() to calculate the size of the file and reserve() that much space in the string before doing the reading?
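
To get a feel for the reallocation question, here is a minimal sketch, not a model of any particular standard library. It assumes the iterator constructor behaves roughly like repeated push_back (an input iterator cannot report how many characters remain), and the helper name countReallocations is made up for illustration. It counts how often the string's capacity changes while n characters are appended:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the question): appends n characters one at a
// time and counts how many times the string's capacity changes along the way.
std::size_t countReallocations(std::size_t n)
{
    std::string s;
    std::size_t reallocations = 0;
    std::size_t lastCapacity = s.capacity();

    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        if (s.capacity() != lastCapacity) {   // the buffer was regrown
            ++reallocations;
            lastCapacity = s.capacity();
        }
    }
    return reallocations;
}
```

Calling countReallocations(1000000) on a given toolchain shows how many times the buffer grew. The exact figure depends on the implementation's growth policy, which is why measuring is more reliable than guessing.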

CAdaker asked Feb 07 '09 21:02


2 Answers

I benchmarked your implementation (1), mine (2), and two others (3 and 4) that I found on Stack Overflow.

Results (average of 100 runs, timed with gettimeofday; the file was 40 paragraphs of lorem ipsum; lower is faster):

  • readFile1: 764
  • readFile2: 104
  • readFile3: 129
  • readFile4: 402

The implementations:

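#include <fstream>
#include <iterator>
#include <string>
#include <vector>

using namespace std;

// (1) The question's approach: build the string straight from stream iterators.
// The length is not known up front, so the string grows as characters arrive.
string readFile1(const string &fileName)
{
    ifstream f(fileName.c_str());
    return string(std::istreambuf_iterator<char>(f),
            std::istreambuf_iterator<char>());
}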

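// (2) Open at the end to learn the size, then read everything in one call.
string readFile2(const string &fileName)
{
    ifstream ifs(fileName.c_str(), ios::in | ios::binary | ios::ate);

    ifstream::pos_type fileSize = ifs.tellg();
    if (streamoff(fileSize) <= 0)
        return string();    // tellg() failed (-1) or the file is empty
    ifs.seekg(0, ios::beg);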

    vector<char> bytes(fileSize);
    ifs.read(&bytes[0], fileSize);

    return string(&bytes[0], fileSize);
}

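// (3) getline() with a delimiter that is not expected to occur, so it reads to EOF.
// Caveat: the delimiter is eof() narrowed to char (typically the byte 0xFF), so a
// file containing that byte is cut short there.
string readFile3(const string &fileName)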
{
    string data;
    ifstream in(fileName.c_str());
    getline(in, data, string::traits_type::to_char_type(
                      string::traits_type::eof()));
    return data;
}

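// (4) reserve() the file size up front, then append through stream iterators.
// Note: if the open fails, tellg() returns -1 and the reserve() request becomes
// enormous, so this version assumes the file can be opened.
string readFile4(const std::string& filename)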
{
    ifstream file(filename.c_str(), ios::in | ios::binary | ios::ate);

    string data;
    data.reserve(file.tellg());
    file.seekg(0, ios::beg);
    data.append(istreambuf_iterator<char>(file.rdbuf()),
                istreambuf_iterator<char>());
    return data;
}

CTT answered Nov 08 '22 21:11


What happens to the performance if you try doing that? Instead of asking "which way is faster?" you can think "hey, I can measure this."

Set up a loop that reads a file of a given size 10,000 times or so, and time it. Then do the same with the reserve() approach and time that. Try a few different file sizes (from small to enormous) and see what you get.
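
A rough sketch of such a measurement, assuming C++11 for std::chrono. The names readPlain, readReserved and averageMicros are illustrative, not taken from the answers above:

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Iterator-based read, as in the question.
std::string readPlain(const std::string& path)
{
    std::ifstream f(path.c_str(), std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(f),
                       std::istreambuf_iterator<char>());
}

// Same read, but reserve() the file size first (the seekg()/tellg() idea).
std::string readReserved(const std::string& path)
{
    std::ifstream f(path.c_str(),
                    std::ios::in | std::ios::binary | std::ios::ate);
    std::string data;
    const std::streamoff size = f.tellg();   // -1 if the stream is unusable
    if (size > 0)
        data.reserve(static_cast<std::string::size_type>(size));
    f.seekg(0, std::ios::beg);
    data.append(std::istreambuf_iterator<char>(f),
                std::istreambuf_iterator<char>());
    return data;
}

// Hypothetical harness: average microseconds per call over `runs` calls.
template <typename Reader>
double averageMicros(Reader reader, const std::string& path, int runs)
{
    typedef std::chrono::steady_clock Clock;
    std::size_t sink = 0;                    // use each result so the work is kept
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < runs; ++i)
        sink += reader(path).size();
    const Clock::time_point stop = Clock::now();
    (void)sink;
    return std::chrono::duration<double, std::micro>(stop - start).count() / runs;
}
```

Call averageMicros(readPlain, path, 10000) and averageMicros(readReserved, path, 10000) on files of several sizes and compare. Repeated reads of the same file are usually served from the OS cache, so the numbers mostly reflect copying and allocation rather than disk speed.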

Greg Hewgill answered Nov 08 '22 21:11