
std::string's capacity(), reserve() & resize() functions

Tags:

c++

string

stl

I want to use std::string simply to create a dynamic buffer and then iterate through it using an index. Is resize() the only function that actually allocates the buffer?

I tried reserve(), but when I try to access the string via an index it asserts. Also, the string's default capacity seems to be 15 bytes (in my case), but I still can't access it as my_string[1].

So the capacity of the string is not the actual buffer? And reserve() doesn't allocate the actual buffer either?

string my_string;

// I want my string to have 20 bytes long buffer
my_string.reserve( 20 );

int i = 0;

for ( parsing_something_else_loop )
{
    char ch = <business_logic>;

    // store the character in the string
    my_string[i++] = ch; // this crashes
}

If I do resize() instead of reserve() then it works fine. How is it that the string has the capacity but can't really access it with []? Isn't that the point to reserve() size so you can access it?

Add-on: In response to the answers, I would like to ask the STL folks: why would anybody use reserve() when resize() does exactly the same and also initializes the string? I have to say I don't appreciate the performance argument in this case that much. All that resize() does in addition to what reserve() does is initialize the buffer, which we know is always nice to do anyway. Can we vote reserve() off the island?

asked Mar 01 '12 18:03 by zar


2 Answers

Isn't that the point to reserve() size so you can access it?

No, that's the point of resize().

reserve() only makes enough room so that future calls that increase the size (e.g. push_back()) will be more efficient.

From your use case it looks like you should use .push_back() instead.

my_string.reserve( 20 );

for ( parsing_something_else_loop )
{
    char ch = <business_logic>;
    my_string.push_back(ch);
}
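As a rough, compilable sketch of the loop above: the function name collect_letters and the "keep only letters" rule are hypothetical stand-ins for the question's parsing loop and business logic, not anything from the original code.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for the question's parsing loop:
// keep only the alphabetic characters of `input`.
std::string collect_letters(const std::string& input)
{
    std::string out;
    out.reserve(input.size());   // capacity only; out.size() is still 0

    for (char ch : input)
    {
        // The cast avoids undefined behaviour for negative char values.
        if (std::isalpha(static_cast<unsigned char>(ch)))
            out.push_back(ch);   // grows size() by one, no index needed
    }

    // Only push_back changed the size, so it counts the characters kept.
    assert(out.size() <= input.size());
    return out;
}
```

With this sketch, collect_letters("a1b2c3") would return "abc", and the returned string's size() reflects only the characters actually appended, not the reserved capacity.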

How is it that the string has the capacity but can't really access it with []?

Calling .reserve() is like blowing up mountains to give you some free land. The amount of free land is the .capacity(). The land is there but that doesn't mean you can live there. You have to build houses in order to move in. The number of houses is the .size() (= .length()).

Suppose you are building a city, but after building the 50th house you find that there is not enough land, so you need to find another place large enough to fit the 51st house and then migrate the whole population there. This is extremely inefficient. If you knew up front that you need to build 1000 houses, you can call

my_string.reserve(1000);

to get enough land to build 1000 houses, and then you call

my_string.push_back(ch);

to construct a house by assigning ch to that location. After the first push_back, the capacity is 1000 but the size is only 1. You may not say

my_string[16] = 'c';

because the house #16 does not exist yet. You may call

my_string.resize(20);

to get houses #0 ~ #19 built in one go, which is why

my_string[i++] = ch;

works fine (as long as 0 ≤ i ≤ 19).
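The land-and-houses picture can be checked with a small sketch. The struct Sizes and the function land_and_houses are hypothetical names introduced only for this illustration; the sketch records size() after each step described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type: the size() observed after each step.
struct Sizes
{
    std::size_t after_reserve;
    std::size_t after_push_back;
    std::size_t after_resize;
};

Sizes land_and_houses()
{
    Sizes s{};
    std::string my_string;

    my_string.reserve(1000);              // land: capacity() is now at least 1000
    assert(my_string.capacity() >= 1000);
    s.after_reserve = my_string.size();   // no houses built yet

    my_string.push_back('a');             // first house
    s.after_push_back = my_string.size();

    my_string.resize(20);                 // houses #0..#19 exist; new ones hold '\0'
    my_string[16] = 'c';                  // valid now, because 16 < size()
    s.after_resize = my_string.size();

    return s;
}
```

Under these assumptions the three recorded sizes come out as 0, 1 and 20, while the capacity stays at 1000 or more throughout.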

See also http://en.wikipedia.org/wiki/Dynamic_array.


For your add-on question,

.resize() cannot completely replace .reserve(), because (1) you don't always need to use up all the allocated space, and (2) default construction + copy assignment is a two-step process, which could take more time than constructing directly (esp. for large objects), e.g.

#include <vector>
#include <unistd.h>

struct SlowObject
{
    SlowObject() { sleep(1); }
    SlowObject(const SlowObject& other) { sleep(1); }
    SlowObject& operator=(const SlowObject& other) { sleep(1); return *this; }
};

int main()
{
    std::vector<SlowObject> my_vector;

    my_vector.resize(3);
    for (int i = 0; i < 3; ++ i)
        my_vector[i] = SlowObject();

    return 0;
}

This wastes at least 9 seconds, while

int main()
{
    std::vector<SlowObject> my_vector;

    my_vector.reserve(3);
    for (int i = 0; i < 3; ++ i)
        my_vector.push_back(SlowObject());

    return 0;
}

wastes only 6 seconds.
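To check the 9-versus-6 arithmetic without waiting, here is a sketch that swaps each sleep(1) for a counter increment. CountedObject, g_calls and the two functions are hypothetical names for this illustration, and the exact resize() count assumes a C++11 or later standard library (older libraries may construct one extra object).

```cpp
#include <cassert>
#include <vector>

// Hypothetical counter-based variant of SlowObject: each special member
// function that would have slept bumps a global tally instead.
static int g_calls = 0;

struct CountedObject
{
    CountedObject() { ++g_calls; }
    CountedObject(const CountedObject&) { ++g_calls; }
    CountedObject& operator=(const CountedObject&) { ++g_calls; return *this; }
};

// resize() followed by assignment through operator[].
int calls_with_resize()
{
    g_calls = 0;
    std::vector<CountedObject> v;
    v.resize(3);                      // constructs 3 elements up front
    for (int i = 0; i < 3; ++i)
        v[i] = CountedObject();       // 1 temporary + 1 assignment each
    return g_calls;
}

// reserve() followed by push_back().
int calls_with_reserve()
{
    g_calls = 0;
    std::vector<CountedObject> v;
    v.reserve(3);                     // raw storage only, constructs nothing
    for (int i = 0; i < 3; ++i)
        v.push_back(CountedObject()); // 1 temporary + 1 copy into storage each
    return g_calls;
}
```

Each counted call corresponds to one second in the sleep-based version, so calls_with_reserve() matches the 6-second figure and calls_with_resize() matches the "at least 9" figure.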

std::string simply mirrors std::vector's interface here.

answered Dec 21 '22 13:12 by kennytm


No -- the point of reserve is to prevent re-allocation. resize sets the usable size, reserve does not -- it just sets an amount of space that's reserved, but not yet directly usable.

Here's one example -- we're going to create a 1000-character random string:

static const int size = 1000;
std::string x;
x.reserve(size);
for (int i=0; i<size; i++)
   x.push_back((char)rand());

reserve is primarily an optimization tool though -- most code that works with reserve should also work (just, possibly, a little more slowly) without calling reserve. The one exception to that is that reserve can ensure that iterators remain valid, when they wouldn't without the call to reserve.
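The validity point can be illustrated with a small sketch. It uses std::vector<int> rather than std::string, because the standard spells out the no-reallocation guarantee after reserve() explicitly for vector; the function name storage_stays_put is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: does the element storage stay at the same address
// while up to `count` elements are appended after reserve(count)?
bool storage_stays_put(std::size_t count)
{
    std::vector<int> v;
    v.reserve(count);
    assert(v.capacity() >= count);

    v.push_back(0);                  // need one element to point at
    const int* first = &v[0];        // pointer taken before the other appends

    for (std::size_t i = 1; i < count; ++i)
        v.push_back(static_cast<int>(i));   // size() never exceeds the reserved amount

    return first == &v[0];           // same address means no reallocation happened
}
```

Because the size never grows past the reserved amount, no reallocation takes place, so the pointer taken after the first push_back still refers to the first element at the end.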

answered Dec 21 '22 12:12 by Jerry Coffin