
Can initializing expression use the variable itself?

Consider the following code:

#include <iostream>

struct Data
{
    int x, y;
};

Data fill(Data& data)
{
    data.x=3;
    data.y=6;
    return data;
}

int main()
{
    Data d=fill(d);
    std::cout << "x=" << d.x << ", y=" << d.y << "\n";
}

Here d is copy-initialized from the return value of fill(), but fill() writes to d itself before returning its result. My concern is that d is used non-trivially before it is initialized, and using an uninitialized variable leads to undefined behavior in some (all?) cases.

So is this code valid, or does it have undefined behavior? If it's valid, will the behavior become undefined once Data stops being POD or in some other case?

Ruslan asked Nov 11 '15 at 11:11



1 Answer

This does not appear to be valid code. It resembles the case in the question "Is passing a C++ object into its own constructor legal?", although the code in that question was valid. The mechanics are not identical, but the same basic reasoning gets us started.

We start with defect report 363 which asks:

And if so, what is the semantics of the self-initialization of UDT? For example

 #include <stdio.h>

 struct A {
        A()           { printf("A::A() %p\n",            this);     }
        A(const A& a) { printf("A::A(const A&) %p %p\n", this, &a); }
        ~A()          { printf("A::~A() %p\n",           this);     }
 };

 int main()
 {
  A a=a;
 }

can be compiled and prints:

A::A(const A&) 0253FDD8 0253FDD8
A::~A() 0253FDD8

and the proposed resolution was:

3.8 [basic.life] paragraph 6 indicates that the references here are valid. It's permitted to take the address of a class object before it is fully initialized, and it's permitted to pass it as an argument to a reference parameter as long as the reference can bind directly. [...]

So although d is not fully initialized we can pass it as a reference.

Where we start to get into trouble is here:

data.x=3;

The draft C++ standard section 3.8 (the same section and paragraph the defect report quotes) says the following; the second bullet is the one that matters here:

Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise, such a glvalue refers to allocated storage (3.7.4.2), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:

  • an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,

  • the glvalue is used to access a non-static data member or call a non-static member function of the object, or

  • the glvalue is bound to a reference to a virtual base class (8.5.3), or

  • the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.

So what does access mean? That was clarified with defect report 1531 which defines access as:

access

to read or modify the value of an object

So fill accesses a non-static data member and hence we have undefined behavior.

This also agrees with section 12.7 which says:

[...]To form a pointer to (or access the value of) a direct non-static member of an object obj, the construction of obj shall have started and its destruction shall not have completed, otherwise the computation of the pointer value (or accessing the member value) results in undefined behavior.

Since you are making a copy anyway, you might as well create an instance of Data inside fill and initialize that. Then you avoid having to pass d at all.

As pointed out by T.C. it is important to explicitly quote the details on when lifetime starts. From section 3.8:

The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. — end note ] The lifetime of an object of type T begins when:

  • storage with the proper alignment and size for type T is obtained, and

  • if the object has non-trivial initialization, its initialization is complete.

The initialization is non-trivial since we are initializing via the copy constructor.

Shafik Yaghmour answered Sep 28 '22 at 08:09