
Reference to uninitialized memory. Undefined behavior?

Allow me to preface by saying that I don't recommend any of the practices below, for obvious reasons. However, I had a discussion today regarding it and some people were adamant about using a reference like this as being undefined behavior.

Here is a test case:

#include <new>     // placement new
#include <string>

struct my_object {
   int a          = 1;
   int b          = 2;
   std::string hi = "hello";
};

// Using union purely to reserve uninitialized memory for a class.
union my_object_storage {
   char dummy;
   my_object memory;
   // C++ will yell at you for doing this without some constructors.
   my_object_storage() {}
   ~my_object_storage() {}
} my_object_storage_instance;

// This is so we can easily access the storage memory through "I"
constexpr my_object &I = my_object_storage_instance.memory;

//-------------------------------------------------------------
int main() {
   // Initialize the object.
   new (&I) my_object();
   // Use the reference.
   I.a = 1;
   // Destroy the object (typically this should be done using RAII).
   I.~my_object();

   // Phase two, REINITIALIZE an object with the SAME reference.
   // We still have the memory allocated which is static, so why not?
   new (&I) my_object();
   // Use the reference.
   I.a = 1;  
   // Destroy the object again. 
   I.~my_object();
}

https://wandbox.org/permlink/YEp9aQUcWdA9YiBI

Basically, the code reserves static memory for a struct and then initializes it in main(). Why would you want to do that? It isn't extremely useful and you should just use a pointer, but here is the question:

With this statement given,

constexpr my_object &I = my_object_storage_instance.memory;

is defining a reference to uninitialized memory undefined behavior? Other people have told me it is, but I'm trying to figure out concretely if that's the case. In the C++ standard we see this paragraph:

A reference shall be initialized to refer to a valid object or function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior.

Specifically "a valid object", which may boil down to: is an object that hasn't had its constructor called yet "valid"? What makes it invalid that it would cause undefined behavior? Are there actually real side effects that could arise?

My argument for this being labeled as undefined behavior is:

  • Compilers might be free to treat the referent as a valid object, because the standard states that it should be one. This matters especially at the point where the reference is bound, and especially if hidden debug instructions that assume a valid object are inserted for diagnostics, which would certainly cause undefined behavior.

My arguments against it being undefined behavior are:

  • It's not dereferencing anything - the paragraph states that, during initialization of a reference, dereferencing nullptr is undefined. It doesn't specifically state undefined behavior if there isn't any dereferencing.
  • Dangling references are a thing, and appear in many cases in normal programs. They only cause undefined behavior IF they are used. This is similar to starting with a dangling reference.

Again, not very useful in practice because there are much better ways to spend your time, but what better place for odd questions and expert opinions than stackoverflow? :)

mukunda asked Jun 08 '19 04:06


1 Answer

You're perfectly fine: your usage of the reference falls under the explicit exception to the rule that a live object is required. From [basic.life]:

Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways.

For an object under construction or destruction, see [class.cdtor]. Otherwise, such a glvalue refers to allocated storage ([basic.stc.dynamic.allocation]), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:

  • the glvalue is used to access the object, or
  • the glvalue is used to call a non-static member function of the object, or
  • the glvalue is bound to a reference to a virtual base class ([dcl.init.ref]), or
  • the glvalue is used as the operand of a dynamic_cast ([expr.dynamic.cast]) or as the operand of typeid.

If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:

  • the storage for the new object exactly overlays the storage location which the original object occupied, and
  • the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
  • the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
  • neither the original object nor the new object is a potentially-overlapping subobject ([intro.object]).
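
To make the limited-use rules quoted above more concrete, here is a minimal sketch of what I believe is permitted and what is not while the reference refers only to allocated storage. The names widget, widget_storage and address_is_stable are hypothetical and exist only for this illustration. The commented-out lines mark the uses that the quoted list classifies as undefined behavior.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical type used only for this illustration.
struct widget {
    int value = 7;
    std::string name = "w";
};

// Union used purely to reserve suitably aligned, uninitialized storage.
union widget_storage {
    char dummy;
    widget slot;
    widget_storage() {}
    ~widget_storage() {}
};

// Returns true if the address seen through the reference stays the same
// before, during, and after the object's lifetime.
bool address_is_stable() {
    static widget_storage storage;
    widget& ref = storage.slot;      // binding to allocated storage: allowed

    void* before = &ref;             // taking the address: allowed
    // int bad = ref.value;          // would be UB: accesses the object
    // auto n = ref.name.size();     // would be UB: non-static member call

    widget* made = new (before) widget();   // lifetime starts here
    bool during = (made == &ref) && (ref.value == 7);

    ref.~widget();                   // lifetime ends, storage remains
    void* after = &ref;              // still allowed: address only

    return during && before == after;
}
```

Under these assumptions, address_is_stable() should return true: only the address, which does not depend on the object's value, is used outside the object's lifetime.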

Thus, your reference validly refers to allocated storage, which is exactly what you need to perform a placement-new and vivify the union member.

And since the dynamic (runtime) type of the object you create exactly matches the static type of the reference you hold, it can be used to access the new object after placement new (either the first or the second).
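
As a rough check of that conclusion, the sketch below reuses the types from the question and runs two lifetimes through the same reference. The names storage_instance, ref and second_lifetime_value are assumptions made for this example. It relies on the conditions quoted above holding: same type, exact overlay, and no const or reference members.

```cpp
#include <cassert>
#include <new>
#include <string>

struct my_object {
    int a          = 1;
    int b          = 2;
    std::string hi = "hello";
};

// Union used purely to reserve uninitialized memory for a my_object.
union my_object_storage {
    char dummy;
    my_object memory;
    my_object_storage() {}
    ~my_object_storage() {}
};

my_object_storage storage_instance;
my_object& ref = storage_instance.memory;   // bound before any object exists

// Hypothetical helper: returns the value of `a` seen in the second object,
// or -1 if the second object does not look freshly constructed.
int second_lifetime_value() {
    new (&ref) my_object();      // first lifetime begins
    ref.a  = 42;
    ref.hi = "changed";
    ref.~my_object();            // first lifetime ends

    new (&ref) my_object();      // second lifetime, same storage, same type
    int a      = ref.a;          // default member initializer gives 1
    bool fresh = (ref.hi == "hello");
    ref.~my_object();            // second lifetime ends

    return fresh ? a : -1;
}
```

Calling second_lifetime_value() should yield 1, since the second placement new runs the default member initializers again and the reference then refers to that new object.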

Ben Voigt answered Oct 21 '22 05:10