
std::string and multiple concatenations

Consider the following snippet, and suppose that a, b, c and d are non-empty strings.

    std::string a, b, c, d;
    d = a + b + c;

When computing the sum of those three std::string instances, standard library implementations create a first temporary std::string object, copy the contents of a and b into its internal buffer, and then repeat the same operation between that temporary string and c.

A fellow programmer argued that, instead of this behaviour, operator+(std::string, std::string) could be defined to return a std::string_helper.

The sole role of this object would be to defer the actual concatenations until the moment it is converted to a std::string. Naturally, operator+(std::string_helper, std::string) would be defined to return the same helper, which would "keep in mind" that it has one more concatenation to carry out.

Such behaviour would save the CPU cost of creating n-1 temporary objects, allocating their buffers, copying them, etc. So my question is: why doesn’t it already work like that? I can’t think of any drawback or limitation.
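To make the idea concrete, here is a minimal sketch of such a deferring helper. Everything in it is an assumption for illustration: string_helper lives outside namespace std, defer() is a made-up entry point (the real operator+ for std::string cannot be redefined by user code), and check_deferred_concat() is only a small self-check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: it remembers the operands instead of concatenating them.
class string_helper {
public:
    explicit string_helper(const std::string& first) { parts_.push_back(&first); }

    // Record one more pending concatenation; no character data is copied here.
    friend string_helper operator+(string_helper lhs, const std::string& rhs) {
        lhs.parts_.push_back(&rhs);
        return lhs;
    }

    // The deferred work happens on conversion: one reserve, then one copy per operand.
    operator std::string() const {
        std::size_t total = 0;
        for (const std::string* part : parts_) total += part->size();
        std::string result;
        result.reserve(total);
        for (const std::string* part : parts_) result += *part;
        return result;
    }

private:
    std::vector<const std::string*> parts_;  // non-owning pointers to the operands
};

// Hypothetical entry point that starts a deferred chain.
inline string_helper defer(const std::string& s) { return string_helper(s); }

// Small self-check showing the intended use.
inline void check_deferred_concat() {
    const std::string a = "foo", b = "bar", c = "baz";
    const std::string d = defer(a) + b + c;  // the conversion to std::string happens here
    assert(d == "foobarbaz");
}
```

This sketch is not free of cost either: the std::vector of pointers allocates, which a real expression-template design would avoid by encoding the chain in the helper's type. The helper also only stores pointers, so keeping it alive past the full expression (for example with auto h = defer(a) + some_temporary;) would leave it pointing at destroyed strings.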

qdii asked Dec 01 '22 23:12


2 Answers

why doesn’t it already work like that?

I can only speculate about why it was originally designed like that. Perhaps the designers of the string library simply didn't think of it; perhaps they thought the extra type conversion (see below) might make the behaviour too surprising in some situations. It is one of the oldest C++ libraries, and a lot of wisdom that we take for granted simply didn't exist in past decades.

As to why it hasn't been changed to work like that: it could break existing code by adding an extra user-defined type conversion. An implicit conversion sequence can contain at most one user-defined conversion. This is specified by C++11, 13.3.3.1.2/1:

A user-defined conversion sequence consists of an initial standard conversion sequence followed by a user-defined conversion followed by a second standard conversion sequence.

Consider the following:

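    #include <string>

    struct thingy {
        thingy(std::string);  // converting constructor
    };

    void f(thingy);

    // g is only a wrapper so that the call below appears inside a function.
    void g(const std::string& some_string, const std::string& another_string) {
        f(some_string + another_string);
    }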

This code is fine if the type of some_string + another_string is std::string. That can be implicitly converted to thingy via the converting constructor. However, if we were to change the definition of operator+ to return another type, the call would need two user-defined conversions (string_helper to string to thingy), and so would fail to compile.
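The one-conversion limit can be checked at compile time with std::is_convertible. The sketch below reuses thingy from above and adds a stand-in string_helper whose only feature is a conversion function to std::string; that stand-in is an assumption for illustration, not a real library type.

```cpp
#include <string>
#include <type_traits>

struct thingy {
    thingy(std::string) {}  // converting constructor
};

// Hypothetical stand-in for the proposed return type of operator+.
struct string_helper {
    operator std::string() const { return std::string(); }
};

// One user-defined conversion (thingy's constructor): allowed implicitly.
static_assert(std::is_convertible<std::string, thingy>::value,
              "std::string converts implicitly to thingy");

// One user-defined conversion (the conversion function): allowed implicitly.
static_assert(std::is_convertible<string_helper, std::string>::value,
              "string_helper converts implicitly to std::string");

// This would need two user-defined conversions in a row, so it is rejected.
static_assert(!std::is_convertible<string_helper, thingy>::value,
              "string_helper does not convert implicitly to thingy");
```

All three assertions hold, which is why f(some_string + another_string) compiles today but would stop compiling if operator+ returned the helper.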

So, if the speed of string building is important, you'll need to use alternative methods like concatenation with +=. Or, according to Matthieu's answer, don't worry about it because C++11 fixes the inefficiency in a different way.
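As a rough sketch of the += approach, the function below reserves the final size once and then appends each operand. The names concat3 and check_concat3 are made up for this example.

```cpp
#include <cassert>
#include <string>

// Builds a + b + c without intermediate temporary strings.
inline std::string concat3(const std::string& a,
                           const std::string& b,
                           const std::string& c) {
    std::string result;
    result.reserve(a.size() + b.size() + c.size());  // at most one allocation, up front
    result += a;  // each append copies into the already reserved buffer
    result += b;
    result += c;
    return result;
}

// Small self-check: the result matches the plain operator+ chain.
inline void check_concat3() {
    const std::string a = "foo", b = "bar", c = "baz";
    assert(concat3(a, b, c) == a + b + c);
}
```

The call to reserve is optional; without it, the appends may reallocate as the string grows, but there are still no intermediate std::string objects.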

Mike Seymour answered Dec 20 '22 23:12


The obvious answer: because the standard doesn't allow it. It would affect existing code by introducing an additional user-defined conversion in some cases: if C is a type with a user-defined constructor taking a std::string, then the change would make

    C obj = stringA + stringB;

ill-formed.

James Kanze answered Dec 21 '22 00:12