
Why are structured bindings defined in terms of a uniquely named variable?

Why are structured bindings defined through a uniquely named variable and all the vague "name is bound to" language?

I personally thought structured bindings worked as follows. Given a struct:

struct Bla
{
    int i;
    short& s;
    double* d;
} bla;

The following:

cv-auto ref-operator [a, b, c] = bla;

is (roughly) equivalent to

cv-auto ref-operator a = bla.i;
cv-auto ref-operator b = bla.s;
cv-auto ref-operator c = bla.d;

And the equivalent expansions for arrays and tuples. But apparently, that would be too simple and there's all this vague special language used to describe what needs to happen.

So I'm clearly missing something, but what is the exact case where a well-defined expansion, in the sense of, let's say, fold expressions (which are a lot simpler to read up on in standardese), would not work?

It seems all the other behaviour of the variables defined by a structured binding actually follows the as-if simple expansion "rule" I'd think would be used to define the concept.

rubenvb asked Apr 12 '18 13:04



2 Answers

Structured binding exists to allow for multiple return values in a language that doesn't allow a function to resolve to more than one value (and thus does not disturb the C++ ABI). This means that whatever syntax is used, the compiler must ultimately store the actual return value. And therefore, that syntax needs a way to talk about exactly how you're going to store that value. Since C++ has some flexibility in how things are stored (as references or as values), the structured binding syntax needs to offer the same flexibility.

Hence the choice of auto, auto& or auto&& applies to the primary value rather than to the subobjects.
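As a minimal sketch of that point (Point and both function names are hypothetical, made up for this illustration), the only difference between the two functions below is whether the hidden primary object is a copy or a reference:

```cpp
#include <cassert>

// Hypothetical aggregate, used only for this illustration.
struct Point { int x; int y; };

// Plain `auto`: the hidden primary object is a copy of `p`,
// so writing through the binding does not change `p`.
int write_through_copy(Point& p) {
    auto [x, y] = p;
    x = 42;
    (void)y;  // silence unused-binding warnings
    return p.x;
}

// `auto&`: the hidden primary object is a reference to `p`,
// so writing through the binding is visible in `p`.
int write_through_reference(Point& p) {
    auto& [x, y] = p;
    x = 42;
    (void)y;
    return p.x;
}
```

Called on a Point{1, 2}, the first function should leave p.x at 1 and the second should change it to 42. This assumes a C++17 (or later) compiler.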

Second, we don't want to impact performance with this feature. Which means that the names introduced will never be copies of the subobjects of the main object. They must be either references or the actual subobjects themselves. That way, people aren't concerned about the performance impact of using structured binding; it is pure syntactic sugar.

Third, the system is designed to handle both user-defined objects and arrays/structs with all public members. In the case of user-defined objects, the "name is bound to" a genuine language reference, the result of calling get<I>(value). If you store a const auto& for the object, then value will be a const& to that object, and get will likely return a const&.

For arrays/public structs, the "names are bound to" something which is not a reference. These are treated exactly as if you had typed value[2] or value.member_name. Doing decltype on such names will not return a reference, unless the unpacked member itself is a reference.
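A small sketch of the decltype behaviour, reusing the Bla struct from the question (the function name and the local variables are assumptions added for illustration). The static_asserts contrast the bindings with a hand-written reference, which is what the question's proposed expansion would produce:

```cpp
#include <cassert>
#include <type_traits>

// The struct from the question.
struct Bla {
    int i;
    short& s;
    double* d;
};

// Hypothetical helper: returns true if writes through the bindings
// reach the original object and the objects it refers to.
bool bindings_alias_members() {
    short value = 7;
    double number = 1.5;
    Bla bla{1, value, &number};

    auto& [a, b, c] = bla;
    // The names are not references, even though the declaration uses auto&.
    static_assert(std::is_same_v<decltype(a), int>);
    // ...unless the member itself is declared as a reference.
    static_assert(std::is_same_v<decltype(b), short&>);
    static_assert(std::is_same_v<decltype(c), double*>);

    // A hand-written reference to the same member has a reference type.
    auto& manual = bla.i;
    static_assert(std::is_same_v<decltype(manual), int&>);

    a = 10;       // writes bla.i
    b = 20;       // writes value through bla.s
    *c = 2.5;     // writes number through bla.d
    manual += 1;  // also writes bla.i
    return bla.i == 11 && value == 20 && number == 2.5;
}
```

So the bindings behave like member selectors on bla rather than like separately declared reference variables, which is the difference the wording has to capture.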

By doing it this way, structured binding remains pure syntactic sugar: it accesses the object in whatever is the most efficient way possible for that object. For user-defined types, that's calling get exactly once per subobject and storing references to the results. For other types, that's using a name that acts like an array/member selector.

Nicol Bolas answered Oct 10 '22 08:10



It seems all the other behaviour of the variables defined by a structured binding actually follows the as-if simple expansion "rule" I'd think would be used to define the concept.

It kind of does. Except the expansion isn't based on the expression on the right hand side, it's based on the introduced variable. This is actually pretty important:

X foo() {
    /* a lot of really expensive work here */
    return {a, b, c};
}

auto&& [a, b, c] = foo();

If that expanded into:

// note, this isn't actually auto&&, but for the purposes of this example, let's simplify
auto&& a = foo().a;
auto&& b = foo().b;
auto&& c = foo().c;

It wouldn't just be extremely inefficient, it could also be actively wrong in many cases. For instance, imagine if foo() was implemented as:

X foo() {
    X x;
    std::cin >> x.a >> x.b >> x.c;
    return x;
}

So instead, it expands into:

auto&& e = foo();
auto&& a = e.a;
auto&& b = e.b;
auto&& c = e.c;

which is really the only way to ensure that all of our bindings come from the same object without any extra overhead.
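To make the difference observable, here is a rough sketch in which foo() has a visible side effect. The call_count counter, the concrete X, and both function names are assumptions added for illustration, and the repeated-call version copies into plain ints to keep things simple:

```cpp
#include <cassert>

struct X { int a; int b; int c; };

// Hypothetical counter standing in for "expensive work" or I/O.
int call_count = 0;

X foo() {
    ++call_count;
    return X{call_count, call_count * 10, call_count * 100};
}

// Structured binding: foo() runs once and all three names
// refer to members of the same hidden object.
int sum_via_structured_binding() {
    auto&& [a, b, c] = foo();
    return a + b + c;
}

// The per-name expansion from above: foo() runs three times and
// each value comes from a different X.
int sum_via_repeated_calls() {
    int a = foo().a;
    int b = foo().b;
    int c = foo().c;
    return a + b + c;
}
```

Starting from call_count == 0, the first function should return 111 after a single call to foo(), while the second should return 321 (1 + 20 + 300) after three calls, mixing members of three different objects.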

And the equivalent expansions for arrays and tuples. But apparently, that would be too simple and there's all this vague special language used to describe what needs to happen.

There are three cases:

  1. Arrays. Each binding acts as if it's an access into the appropriate index.
  2. Tuple-like. Each binding comes from a call to get<I> (a member e.get<I>() if one is found, otherwise get<I>(e) via argument-dependent lookup).
  3. Aggregate-like. Each binding names a member.

That's not too bad? Hypothetically, #1 and #2 could be combined (could add the tuple machinery to raw arrays), but then it's potentially more efficient not to do this.
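The three cases can be sketched side by side. Pixel, Person, and the function names below are hypothetical; Person opts into the tuple-like protocol through a member get<I>() plus std::tuple_size and std::tuple_element specializations, which is one of several ways to do it:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

// Case 3: an aggregate with all public members.
struct Pixel { int x; int y; };

// Case 2: a hypothetical class with private members that opts in
// to the tuple-like protocol.
class Person {
public:
    Person(std::string name, int age) : name_(std::move(name)), age_(age) {}

    template <std::size_t I>
    const auto& get() const {
        static_assert(I < 2, "Person has two elements");
        if constexpr (I == 0) return name_;
        else return age_;
    }

private:
    std::string name_;
    int age_;
};

namespace std {
template <> struct tuple_size<Person> : integral_constant<size_t, 2> {};
template <> struct tuple_element<0, Person> { using type = string; };
template <> struct tuple_element<1, Person> { using type = int; };
}  // namespace std

// Case 1: each binding acts like an index into the array.
int sum_array() {
    int values[3] = {1, 2, 3};
    auto& [a, b, c] = values;
    a = 10;  // same as values[0] = 10
    return values[0] + b + c;
}

// Case 2: each binding is initialized from p.get<I>().
int person_age_plus_name_length() {
    const Person p{"Ada", 36};
    const auto& [name, age] = p;
    return age + static_cast<int>(name.size());
}

// Case 3: each binding names a member of the (copied) Pixel.
int pixel_sum() {
    Pixel px{3, 4};
    auto [x, y] = px;
    return x + y;
}
```

Note that because this Person::get only returns const references, the bindings are declared with const auto&; a non-const declaration would need a non-const get overload as well.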

A healthy amount of the complexity in the wording (IMO) comes from dealing with the value categories. But you'd need that regardless of the way anything else is specified.

Barry answered Oct 10 '22 08:10
