Is it allowed to print the address of a dangling reference?

Consider this code, which is slightly modified from here:

#include <iostream>

void foo() {
    int i;
    static auto f = [&i]() { std::cout << &i << "\n";};
    f();
}

int main() {
    foo();
    foo();
}

The lambda f is initialized only on the first call. During the second call the captured variable has ceased to exist, so the lambda holds a dangling reference, but it only prints the variable's address. There is no obvious issue with gcc, and the output looks ok:

0x7ffc25301ddc
0x7ffc25301ddc

Is it undefined behavior to take the address of a dangling reference, or is it ok?

For a very similar example clang (-Wall -Werror -pedantic -O3) produces a warning:

#include <iostream>

auto bar() {
    int i;
    return [&i]() {std::cout << &i << "\n"; };
}

int main() {
    bar()();
}

warning:

<source>:5:14: error: address of stack memory associated with local variable 'i' returned [-Werror,-Wreturn-stack-address]
    return [&i]() {std::cout << &i << "\n"; };
             ^
<source>:5:14: note: captured by reference here
    return [&i]() {std::cout << &i << "\n"; };

Of course, the fact that the first example compiles and produces the expected(?) output while the second draws a warning does not mean a thing. Where in the standard can I find whether using the address of a dangling reference is fine or not?

PS: I suppose the answer is somewhere in [basic.life]. I have browsed it several times, but I have a hard time seeing what applies and what it is trying to tell me.

asked Sep 13 '21 20:09 by 463035818_is_not_a_number

1 Answer

I believe this is poorly specified, but may be implementation-defined.

The question and the other answer presume that i is a dangling reference. That presumes that it is a reference at all. But that's not correct!

It is notable that a reference capture is not a reference. The standard explicitly and intentionally says that a reference capture need not result in non-static data members of the closure type. [expr.prim.lambda/12]:

It is unspecified whether additional unnamed non-static data members are declared in the closure type for entities captured by reference.

That's why the rewriting of entity names happens only for copy captures. [expr.prim.lambda/11]:

Every id-expression within the compound-statement of a lambda-expression that is an odr-use of an entity captured by copy is transformed into an access to the corresponding unnamed data member of the closure type.

The same is not true of reference captures. The id-expression i within the lambda body refers to the original entity. It is not, as one might reasonably assume, a non-static member of the closure type which acts as an int&.
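To make the copy-capture rewrite concrete, here is a rough sketch of a hand-written class that behaves like the closure type of [i]() { return i + 1; }. The names CopyClosure, i_copy, call_lambda, call_handwritten and check_equivalence are hypothetical and only for illustration; a real closure type and its data member are unnamed. No comparable class can be written portably for a reference capture, because the standard leaves it unspecified whether any member exists at all.

```cpp
#include <cassert>

// Hypothetical stand-in for the closure type of: [i]() { return i + 1; }
struct CopyClosure {
    int i_copy;                 // data member for the copy capture (unnamed in a real closure)
    int operator()() const {
        return i_copy + 1;      // the "i" in the lambda body becomes this member access
    }
};

// Result of calling the real lambda.
int call_lambda(int i) {
    auto f = [i]() { return i + 1; };
    return f();
}

// Result of calling the hand-written equivalent.
int call_handwritten(int i) {
    CopyClosure f{i};
    return f();
}

// Both forms are expected to agree for any input.
void check_equivalence() {
    assert(call_lambda(41) == call_handwritten(41));
}
```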

As far as I can tell, this dates back to some rewording in N2927 before C++11. Prior to that, during standardization, reference captures apparently did result in closure type members and did trigger a rewrite in the body just as copy captures. The change was intentional.

So... the lambda body names an object i of type int which, on the second invocation, is not only outside its lifetime but has also had its storage released.

With that in mind, let's try to infer if that's okay.

The standard explicitly allows using the name outside lifetime but before storage re-use. [basic.life/7]:

after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see [class.cdtor]. Otherwise, such a glvalue refers to allocated storage ([basic.stc.dynamic.allocation]), and using the properties of the glvalue that do not depend on its value is well-defined.

That doesn't actually apply, because here the storage is released. However, for the case where storage is not released, you can infer that the committee generally intends uses of the name that do not depend on the object's value to be OK. In practice, that mostly means avoiding the lvalue-to-rvalue conversion.
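As an illustration of the case [basic.life/7] does cover (lifetime ended, storage neither reused nor released), here is a minimal sketch using placement new. Widget and address_usable_after_lifetime are hypothetical names. It shows only the situation the paragraph permits and says nothing about the released-storage case in the question.

```cpp
#include <cassert>
#include <new>

// Hypothetical type with a non-trivial destructor, so that the explicit
// destructor call below clearly ends the object's lifetime.
struct Widget {
    int value;
    ~Widget() {}
};

// Returns true if the address formed after the lifetime ended equals the
// address of the still-live storage.
bool address_usable_after_lifetime() {
    alignas(Widget) unsigned char storage[sizeof(Widget)];
    Widget* p = new (storage) Widget{42};   // lifetime starts
    assert(static_cast<void*>(p) == static_cast<void*>(storage));

    p->~Widget();                           // lifetime ends; storage is neither reused nor released

    void* addr = &*p;                       // forms the address only, never reads the value
    // int v = p->value;                    // would be undefined: member access outside the lifetime
    return addr == static_cast<void*>(storage);
}
```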

The standard also explicitly invalidates pointers on storage release. [basic.stc.general/4]:

When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.

We don't have a pointer. Of note, references aren't "zapped", but we don't have a reference either.

So, how do we put this together?

Is naming i alone a problem? It is explicitly allowed to name i after its lifetime but before storage release. I cannot find any prohibition against naming i after storage release. It must refer to the same object, which is outside its lifetime. In other words, the rules say i is an lvalue representing some object, and they also say that continues after the object lifetime. They do not say it stops at storage release.

Is using but not accessing i a problem? By taking the address, we do not trigger lvalue-to-rvalue conversion, and we do not "access" i. I cannot find a prohibition. The address operator ([expr.unary.op/3]) says it will return the address of the designated object, which is the object the lvalue names.

What is the result of &i? The language about pointer zapping could be read to mean that the result, which is a pointer representing the address of storage which was released, must be an invalid pointer value.

Can we print &i? The language on invalid pointer values is clear that indirection and deallocation are undefined, but everything else is implementation-defined.
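If all that is needed is the printed address, one way to sidestep the question is to format the address while the object is still alive and keep only the text. The sketch below assumes that this is acceptable for the use case. The function name describe_address is hypothetical, and the exact text produced for a pointer is implementation-defined.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Formats the address of a local while the local is still alive and returns
// the text. The returned std::string is an ordinary value, so nothing here
// touches a pointer after the storage has been released.
std::string describe_address() {
    int i = 0;
    std::ostringstream out;
    out << static_cast<const void*>(&i);    // pointer is valid at this point
    assert(!out.fail());                    // formatting a pointer is not expected to fail
    return out.str();
}
```

A lambda could then capture the resulting string by copy instead of capturing i by reference.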

So... it may be implementation-defined.

answered Nov 15 '22 01:11 by Jeff Garrett