 

How to pre-allocate memory for a std::string object?

Tags: c++, string

I need to copy a file into a string. Is there some way to preallocate memory for that string object, and a way to read the file content directly into that string's memory?

Ramadheer Singh asked Jul 21 '10 20:07



2 Answers

std::string has a .reserve method for pre-allocation.

    std::string s;
    s.reserve(1048576); // reserve 1 MB
    read_file_into(s);
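As a rough sketch of how the reservation might fit together with a read loop: `read_file_into` above is a placeholder, and the names `append_file` and `load_with_reserve` below are hypothetical helpers, not library functions. The sketch assumes the caller has a reasonable guess at the file size; `reserve` only affects capacity, so `size()` stays 0 until data is appended.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: appends the contents of the file at `path` to `s`
// in fixed-size chunks. Returns false if the file could not be opened.
bool append_file(std::string& s, const char* path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) return false;

    char buf[4096];
    // read() fails on the final partial chunk, but gcount() still reports
    // how many characters were actually extracted.
    while (file.read(buf, sizeof buf) || file.gcount() > 0) {
        s.append(buf, static_cast<std::size_t>(file.gcount()));
    }
    return true;
}

// Hypothetical helper: reserves `expected_size` bytes up front so that the
// appends do not need to reallocate while the data fits within that estimate.
std::string load_with_reserve(const char* path, std::size_t expected_size) {
    std::string s;
    s.reserve(expected_size);  // capacity() >= expected_size; size() is still 0
    append_file(s, path);
    return s;
}
```

A call such as `load_with_reserve("data.txt", 1048576)` would mirror the 1 MB reservation above; if the file turns out larger than the estimate, the string simply grows as usual.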
kennytm answered Sep 23 '22 10:09


This isn't so much an answer in itself, as a kind of a comment on/summary/comparison of a couple of other answers (as well as a quick demonstration of why I've recommended the style of code @Johannes - litb gives in his answer). Since @sbi posted an alternative that looked pretty good, and (especially) avoided the extra copy involved in reading into a stringstream, then using the .str() member to get a string, I decided to write up a quick comparison of the two:

[ Edit: I've added a third test case using @Tyler McHenry's istreambuf_iterator-based code, and added a line to print out the length of each string that was read to ensure that the optimizer didn't optimize away the reading because the result was never used.]

[ Edit2: And now, code from Martin York has been added as well...]

    #include <fstream>
    #include <sstream>
    #include <string>
    #include <iostream>
    #include <iterator>
    #include <time.h>

    int main() {
        // Test 1: stream the file buffer into an ostringstream, then copy it out with .str().
        std::ostringstream os;
        std::ifstream file("equivs2.txt");

        clock_t start1 = clock();
        os << file.rdbuf();
        std::string s = os.str();
        clock_t stop1 = clock();

        std::cout << "\ns.length() = " << s.length();

        // Test 2: reserve, then assign through istream_iterator.
        // istream_iterator<char> uses formatted input, so it skips whitespace.
        std::string s2;

        clock_t start2 = clock();
        file.seekg(0, std::ios_base::end);
        const std::streampos pos = file.tellg();
        file.seekg(0, std::ios_base::beg);

        if (pos != std::streampos(-1))
            s2.reserve(static_cast<std::string::size_type>(pos));
        s2.assign(std::istream_iterator<char>(file), std::istream_iterator<char>());
        clock_t stop2 = clock();

        std::cout << "\ns2.length = " << s2.length();

        // The istream_iterator loop leaves eofbit/failbit set; reset before reusing the stream.
        file.clear();

        // Test 3: reserve, then assign through istreambuf_iterator (unformatted).
        std::string s3;

        clock_t start3 = clock();
        file.seekg(0, std::ios::end);
        s3.reserve(file.tellg());
        file.seekg(0, std::ios::beg);

        s3.assign((std::istreambuf_iterator<char>(file)),
                std::istreambuf_iterator<char>());
        clock_t stop3 = clock();

        std::cout << "\ns3.length = " << s3.length();

        // New Test
        // Test 4: resize to the file size, then read() straight into the string's buffer.
        std::string s4;

        clock_t start4 = clock();
        file.seekg(0, std::ios::end);
        s4.resize(file.tellg());
        file.seekg(0, std::ios::beg);

        file.read(&s4[0], s4.length());
        clock_t stop4 = clock();

        std::cout << "\ns4.length = " << s4.length();

        std::cout << "\nTime using rdbuf: " << stop1 - start1;
        std::cout << "\nTime using istream_iterator: " << stop2 - start2;
        std::cout << "\nTime using istreambuf_iterator: " << stop3 - start3;
        std::cout << "\nTime using read: " << stop4 - start4;
        return 0;
    }

Now the impressive part -- the results. First with VC++ (in case somebody cares, Martin's code is fast enough that I increased the file size to get a meaningful time for it):

s.length() = 7669436
s2.length = 6390688
s3.length = 7669436
s4.length = 7669436
Time using rdbuf: 184
Time using istream_iterator: 1332
Time using istreambuf_iterator: 249
Time using read: 48

Then with gcc (cygwin):

s.length() = 8278035
s2.length = 6390689
s3.length = 8278035
s4.length = 8278035
Time using rdbuf: 62
Time using istream_iterator: 2199
Time using istreambuf_iterator: 156
Time using read: 16

[ end of edit -- the conclusions remain, though the winner has changed -- Martin's code is clearly the fastest. ]
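The shorter s2.length in both runs is consistent with how `std::istream_iterator<char>` works: it reads through formatted `operator>>`, which skips whitespace by default, whereas `std::istreambuf_iterator<char>` pulls raw characters from the stream buffer. A small sketch of the difference, using an in-memory `std::istringstream` instead of a file (the function names are made up for illustration):

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Formatted extraction: whitespace is skipped by default.
std::string read_formatted(const std::string& text) {
    std::istringstream in(text);
    return std::string(std::istream_iterator<char>(in),
                       std::istream_iterator<char>());
}

// Same iterator type, but with whitespace skipping turned off on the stream.
std::string read_formatted_noskipws(const std::string& text) {
    std::istringstream in(text);
    in >> std::noskipws;
    return std::string(std::istream_iterator<char>(in),
                       std::istream_iterator<char>());
}

// Unformatted extraction from the stream buffer: every character is kept.
std::string read_unformatted(const std::string& text) {
    std::istringstream in(text);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Given the input `"a b\nc"`, `read_formatted` drops the space and the newline, while the other two return the text unchanged. So the istream_iterator timing above is measuring a slightly different job than the other three.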

The results are quite consistent with respect to which is fastest and slowest. The only inconsistency is with how much faster or slower one is than another. Though the placements are the same, the speed differences are much larger with gcc than with VC++.
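For reference, the fastest variant (resize, then `read` directly into the string's buffer) could be packaged roughly as follows. This is a sketch rather than the code that was timed: the name `read_whole_file` is hypothetical, and unlike the benchmark it opens the file in binary mode so that the size reported by `tellg` matches the number of characters `read` delivers.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reads the entire file at `path` into `out`.
// Returns false if the file cannot be opened, sized, or fully read.
bool read_whole_file(const char* path, std::string& out) {
    // std::ios::ate opens the file positioned at the end, so tellg() gives the size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) return false;

    const std::streamoff size = file.tellg();
    if (size < 0) return false;

    out.resize(static_cast<std::string::size_type>(size));
    file.seekg(0, std::ios::beg);

    if (size > 0 && !file.read(&out[0], static_cast<std::streamsize>(size))) {
        // Short read: keep only what was actually extracted.
        out.resize(static_cast<std::string::size_type>(file.gcount()));
        return false;
    }
    return true;
}
```

Note that `resize` value-initializes the buffer before `read` overwrites it, which is a cost `reserve` would not have; the trade-off is that `reserve` alone does not make the extra capacity part of the string, so there is no valid range to `read` into.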

Jerry Coffin answered Sep 22 '22 10:09