
Doesn't C++ mandate that (cond ? string_1 : string_2) initialize a string?

Considering:

void foo(std::string& s);

Inside this function, the expression s is an lvalue of type std::string (not std::string&), because references don't really "exist" in expressions:

[expr.type/1]: If an expression initially has the type “reference to T” ([dcl.ref], [dcl.init.ref]), the type is adjusted to T prior to any further analysis. The expression designates the object or function denoted by the reference, and the expression is an lvalue or an xvalue, depending on the expression. [..]
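As a rough illustration of that adjustment, the sketch below compares a reference parameter with a plain local variable. The helper name designated_object is hypothetical and not part of the question; the checks rely only on decltype, where decltype((e)) yields T& when e is an lvalue of type T.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical helper (not from the question): returns the address of the
// object that the expression `s` designates.
const std::string* designated_object(std::string& s)
{
    std::string local = "copy";

    // The declared types differ: one is a reference, the other is not.
    static_assert(!std::is_same<decltype(s), decltype(local)>::value,
                  "declared types differ");

    // As expressions, both s and local are lvalues of type std::string,
    // so decltype((e)) reports std::string& for each of them.
    static_assert(std::is_same<decltype((s)), decltype((local))>::value,
                  "same expression type and value category");

    // s designates the caller's object, not anything local to this function.
    assert(&s != &local);
    return &s;
}
```

Calling designated_object(x) on some std::string x should return &x, which is consistent with the expression designating the referred-to object rather than a copy.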

Now consider:

const std::string& foo(const std::string& s1, const std::string& s2)
{
    return (s1.size() < s2.size() ? s1 : s2);
}

There was a debate on another question about whether the conditional operator here creates a temporary (which would in turn mean that foo returns a dangling reference).

My interpretation was that, yes, it must, because:

[expr.cond/5]: If the second and third operands are glvalues of the same value category and have the same type, the result is of that type and value category and it is a bit-field if the second or the third operand is a bit-field, or if both are bit-fields.

and:

[expr.cond/7.1]: The second and third operands have the same type; the result is of that type and the result object is initialized using the selected operand.

Initialising a std::string from a std::string involves a copy.

However, I was surprised that GCC didn't warn on the dangling reference. Investigating, I found that foo indeed does propagate the reference semantics for the selected argument:

#include <string>
#include <iostream>

using std::string;
using std::cout;

void foo(string& s1, string& s2)
{
    auto& s3 = (s1.size() < s2.size() ? s1 : s2);
    s3 = "what";
}

int main()
{
    string s1 = "hello";
    string s2 = "world";
    
    foo(s1, s2);
    
    cout << s1 << ' ' << s2 << '\n';   // Output: hello what
}


The original s2, passed by reference into foo, has been selected by the conditional operator, then bound to s3, and modified. There is no evidence of any copying going on.

This doesn't match my reading of how expressions work and of how the conditional operator works.

So, which of my above statements is incorrect, and why?


Since there seems to be some confusion, below I have diagrammed what my understanding says is the chain of events. I realise that it's wrong — my testcase above proves that. But I'd like to understand exactly why. Ideally I'd like some standard wording, not just "you're wrong". I already know I'm wrong. That's why I'm asking. 😀

  1. References to strings passed into function
  2. Expression evaluated containing conditional operator
    • The latter two operands are lvalue expressions of type const std::string (not references!)
    • The latter two operands have the same type and value category, so the conditional operator's result is const std::string, too
  3. The result of the expression is initialised from the selected operand; we've already established that the operands and the result type are const std::string, so it's a const std::string initialised from a const std::string
  4. The expression, as one that initialises an object, has value category rvalue (and I believe this implies the object is also a temporary?)
  5. Then we initialise the function's return value from that temporary; this is evil as the return type is a reference, so we dangle.

Asteroids With Wings asked Jul 31 '20 19:07


2 Answers

From the very section you quote:

If the second and third operands are glvalues of the same value category and have the same type, the result is of that type and value category and it is a bit-field if the second or the third operand is a bit-field, or if both are bit-fields.

The second and third operands are both lvalues of type std::string const, so the result is an lvalue of type std::string const.

"Initialising a std::string from a std::string involves a copy."

But we're not initializing a std::string from a std::string. In:

const std::string& foo(const std::string& s1, const std::string& s2)
{
    return (s1.size() < s2.size() ? s1 : s2);
}

We're initializing a std::string const& from an lvalue of type std::string const. That's just a direct reference binding. No copy necessary.
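One way to check this claim is to ask the compiler for the type of the conditional expression and to compare addresses at run time. The sketch below is only an illustration; the function name pick_shorter is hypothetical and stands in for the question's foo.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the question's foo.
const std::string& pick_shorter(const std::string& s1, const std::string& s2)
{
    // Both operands are lvalues of type const std::string, so the
    // conditional expression is itself an lvalue of that type
    // (decltype reports const std::string& for such an expression).
    static_assert(std::is_same<decltype(s1.size() < s2.size() ? s1 : s2),
                               const std::string&>::value,
                  "the conditional expression is an lvalue");

    // Direct reference binding: r refers to one of the arguments.
    const std::string& r = (s1.size() < s2.size() ? s1 : s2);
    assert(&r == &s1 || &r == &s2);
    return r;
}
```

If a temporary were involved, the returned reference could not have the same address as either argument.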


Barry answered Oct 31 '22 08:10


My misunderstanding appears to have stemmed from step #3 in my "diagram": the wording I quoted regarding initialising the result ([expr.cond/7.1]) doesn't apply; it's under the "otherwise, the result is a prvalue" clause. I'd missed that.

So, there is in fact no talk about initialisation with respect to our conditional operator expression here. Thus, no new object being created and, if such an object doesn't exist, it cannot be a temporary.

The only description of what we get back, then, is:

[expr.cond/1]: [..] the result of the conditional expression is the value of the second expression, otherwise that of the third expression.

I'd actually maintain that this is not the clearest wording, but when compared to similar wording in e.g. the rules for the built-in subscript operator (which doesn't return a reference type, but its result is "the value" referred to by its two operands), it does seem unambiguous enough that the whole expression here "is" one of the original strings.
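For contrast, the "otherwise, the result is a prvalue" branch does come into play when the operands differ in value category. The sketch below mixes an lvalue operand with a prvalue one; the function name mixed_operands_copy is hypothetical, and this is an illustration under that assumption rather than anything from the original question.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical helper: returns true if the conditional expression produced
// a new object instead of designating `a` itself.
bool mixed_operands_copy(std::string& a, bool pick_a)
{
    // One operand is an lvalue, the other a prvalue, so the result is a
    // prvalue: decltype reports a non-reference type.
    static_assert(!std::is_reference<
                      decltype(pick_a ? a : std::string("temp"))>::value,
                  "the conditional expression is a prvalue");

    // Binding a const reference materializes a temporary and extends its
    // lifetime to that of r.
    const std::string& r = pick_a ? a : std::string("temp");

    // Even when a is selected, r refers to a copy of it.
    assert(&r != &a);
    return &r != &a;
}
```

Here the result object really is initialized from the selected operand, which is the situation [expr.cond/7.1] describes.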


Asteroids With Wings answered Oct 31 '22 06:10