Why is the copy constructor called twice in heap array initialization?

Tags: c++, copy-constructor, g++, c++14, clang++

For the following C++14 code, why does g++'s generated code for new A[1]{x} seem to invoke the copy constructor twice?

#include <iostream>
using namespace std;

class A {
public:
    A()           { cout << "default ctor" << endl; }
    A(const A& o) { cout << "copy ctor" << endl;    }
    ~A()          { cout << "dtor" << endl;         }
};

int main()
{
    A x;
    cout << "=========" << endl;
    A* y = new A[1]{x};
    cout << "=========" << endl;
    delete[] y;
    return 0;
}

Compilation and output:

$ g++ -fno-elide-constructors -std=c++14 test.cpp && ./a.out
default ctor
=========
copy ctor
copy ctor
dtor
=========
dtor
dtor

Interestingly, for the same code, clang++ only invokes the copy constructor once:

$ clang++ -fno-elide-constructors -std=c++14 test.cpp && ./a.out
default ctor
=========
copy ctor
=========
dtor
dtor

Furthermore, when using g++, changing the A* y = new A[1]{x}; line to any of the following will cause the copy constructor to be called only once:

A* y = new A {x}; - normal heap object instead of heap array of size 1
A y[1] {x}; - array on stack instead of heap

So it appears that the double copy constructor behavior is only exhibited in heap-array initialization.
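
For anyone who wants to reproduce the comparison without reading console output, here is a minimal sketch that counts copy-constructor calls for the three forms. `Counted` and the three helper functions are hypothetical names introduced only for this illustration. The count for the heap-array form is deliberately not pinned down beyond "at least one", because that is exactly where the compilers disagree.

```cpp
#include <cassert>

// Hypothetical helper class: counts how often its copy constructor runs.
struct Counted {
    static int copies;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// Stack array of one element, initialized from an lvalue.
int copies_for_stack_array() {
    Counted x;
    Counted::copies = 0;
    Counted y[1]{x};
    (void)y;
    return Counted::copies;
}

// Single heap object, initialized from an lvalue.
int copies_for_single_heap_object() {
    Counted x;
    Counted::copies = 0;
    Counted* y = new Counted{x};
    int n = Counted::copies;
    delete y;
    return n;
}

// Heap array of one element: the form in question.
// The exact count depends on compiler and flags, so only a lower bound is checked.
int copies_for_heap_array() {
    Counted x;
    Counted::copies = 0;
    Counted* y = new Counted[1]{x};
    int n = Counted::copies;
    delete[] y;
    assert(n >= 1);
    return n;
}
```

Printing the three return values from `main` should show 1 for the first two forms on either compiler, while the last value depends on the compiler and on flags such as `-fno-elide-constructors`.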


asked May 15 '21 03:05

michaeljan

2 Answers

TL;DR: It's likely a GCC defect: g++ seems to treat {x} as a temporary in this context. For each element in new A[N]{x1, x2, ... xN}, the copy constructor should get called once according to [dcl.init] and [expr.new]. Instead, GCC likely interprets it as an initializer list and thus creates an intermediate rvalue. We can force GCC to interpret it otherwise, though.

why does g++'s generated code for new A[1]{x} seem to invoke the copy constructor twice?

Because there is no move constructor. If we add a move constructor and some more output, we get a better picture of the situation (Compiler Explorer):

#include <iostream>
using namespace std;

class A {
public:
    A()           { cout << "default ctor @" << this << endl; }
    A(A&& o)      { cout << "move ctor: " << &o << " to " << this << endl;    }
    A(const A& o) { cout << "copy ctor: " << &o << " to " << this << endl;    }
    ~A()          { cout << "dtor @" << this << endl;         }
};

int main()
{
    A x;
    cout << "=========" << endl;
    A* y = new A[1]{x};
    cout << "=========" << endl;
    delete[] y;
    return 0;
}

Note that our new A(A&&) constructor exposes the intermediate temporary:

default ctor @0x7ffec28b5476
=========
copy ctor: 0x7ffec28b5476 to 0x7ffec28b5477
move ctor: 0x7ffec28b5477 to 0x55d0a7fa6288
dtor @0x7ffec28b5477
=========
dtor @0x55d0a7fa6288
dtor @0x7ffec28b5476

Indeed, if we delete the move constructor (A(A&&) = delete;), g++ won't even compile the program anymore (but Clang still accepts it).
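
To show why a deleted move constructor works as a probe here, consider the following sketch. `NoMove` and `stack_array_from_lvalue` are hypothetical names used only for illustration. Initializing from an lvalue only ever needs the copy constructor, so any attempt to use the move constructor means a temporary was introduced somewhere. The heap-array form that g++ rejected is intentionally left out so that the snippet compiles everywhere.

```cpp
#include <type_traits>

// Hypothetical class: copyable, but with an explicitly deleted move constructor.
struct NoMove {
    NoMove() = default;
    NoMove(const NoMove&) {}
    NoMove(NoMove&&) = delete;
};

// Initializing from an lvalue selects the copy constructor: well-formed.
static_assert(std::is_constructible<NoMove, const NoMove&>::value,
              "copying from an lvalue only needs the copy constructor");

// Initializing from an rvalue selects the deleted move constructor: ill-formed.
static_assert(!std::is_constructible<NoMove, NoMove&&>::value,
              "an rvalue source picks the deleted move constructor");

// The stack-array form initializes the element directly from the lvalue x,
// so the deleted move constructor is never considered a match.
inline void stack_array_from_lvalue() {
    NoMove x;
    NoMove y[1]{x};
    (void)y;
}
```

If a compiler complains about the deleted move constructor for an initializer that is an lvalue, it must have created an rvalue in between, which is what the g++ output above suggests.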

It seems like g++ misinterprets the braced-init-list. IMHO, [expr.new] may allow that kind of interpretation, but this seems like a g++ defect and should probably get reported as such.

However, the whole ordeal reminds me of an older question of mine (Are curly braces really required around initialization?). So let's introduce more braces to make sure that g++ cannot misinterpret our initializer:

int main()
{
    A x;
    cout << "=========" << endl;
    A* y = new A[1]{{{x}}};
    cout << "=========" << endl;
    delete[] y;
    return 0;
}

This variant circumvents g++'s behaviour:

initializer for T[1]     start : {
initializer for first element  : {
actual initializer for A       : {x}

The program output is then (Compiler Explorer):

default ctor @0x7ffede3d9967
=========
copy ctor: 0x7ffede3d9967 to 0x1eb0ec8
=========
dtor @0x1eb0ec8
dtor @0x7ffede3d9967

So for multiple elements, we end up in brace-hell (Compiler Explorer):

int main()
{
    A x;
    cout << "=========" << endl;
    A* y = new A[2]{{{x}},{{x}}};
    cout << "=========" << endl;
    delete[] y;
    return 0;
}

Again, no additional constructors are called:

default ctor @0x7fff3a2a7a27
=========
copy ctor: 0x7fff3a2a7a27 to 0x1f49ec8
copy ctor: 0x7fff3a2a7a27 to 0x1f49ec9
=========
dtor @0x1f49ec9
dtor @0x1f49ec8
dtor @0x7fff3a2a7a27


answered Sep 19 '22 16:09

Zeta

After doing some research in the standard, I came to the conclusion that g++ is wrong and there should be only one copy constructor invocation. Interestingly, there seem to be two possible interpretations of which type of initialization occurs here. Both lead to the same conclusion, though.

First interpretation - direct initialization

From the C++14 Standard (Working Draft), [expr.new] 17:

A new-expression that creates an object of type T initializes that object as follows:

(17.1) — If the new-initializer is omitted, the object is default-initialized (8.5). [ Note: If no initialization is performed, the object has an indeterminate value. — end note ]

(17.2) — Otherwise, the new-initializer is interpreted according to the initialization rules of 8.5 for direct initialization.

In our case the new-initializer is present, so (according to 17.2) new A[1]{x} is interpreted using direct initialization rules. Let's look at [dcl.init] 16:

The initialization that occurs in the forms

T x(a);

T x{a};

as well as in new expressions (5.3.4), static_cast expressions (5.2.9), functional notation type conversions (5.2.3), mem-initializers (12.6.2), and the braced-init-list form of a condition is called direct-initialization

Ok, this further confirms that we are dealing with direct initialization. Now let's see how direct initialization works in [dcl.init] 17:

The semantics of initializers are as follows. The destination type is the type of the object or reference being initialized and the source type is the type of the initializer expression. If the initializer is not a single (possibly parenthesized) expression, the source type is not defined.

[... 17.1 through 17.5 omitted ...]

(17.6) — If the destination type is a (possibly cv-qualified) class type:

(17.6.1) — If the initialization is direct-initialization, or if it is copy-initialization where the cv-unqualified version of the source type is the same class as, or a derived class of, the class of the destination, constructors are considered. The applicable constructors are enumerated (13.3.1.3), and the best one is chosen through overload resolution (13.3). The constructor so selected is called to initialize the object, with the initializer expression or expression-list as its argument(s). If no constructor applies, or the overload resolution is ambiguous, the initialization is ill-formed.

According to the excerpt above, when the object being initialized is a class type (as is the case here) and when dealing with direct initialization (as is the case here) the destination object is initialized using the most suitable constructor.

I won't cite the rules about how the constructor is selected, as in this case when there is only the default A::A() constructor and the copy A::A(const A&) constructor, the copy constructor is obviously the better choice when initializing with x of type A. This is the source of one of the copy constructor invocations.

I didn't find any remarks about the initialization of arrays in particular in section [expr.new] and why it should cause a second constructor invocation.
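
As a minimal sketch of the overload-resolution step described above, the following records which constructor built an object. `Ctor`, `Probe`, and the helper functions are hypothetical names, and the snippet assumes C++14 (relaxed `constexpr`) so that the result can be checked at compile time.

```cpp
// Tag recording which constructor created an object (illustration only).
enum class Ctor { Default, Copy, Move };

struct Probe {
    Ctor made_by;
    constexpr Probe() : made_by(Ctor::Default) {}
    constexpr Probe(const Probe&) : made_by(Ctor::Copy) {}
    constexpr Probe(Probe&&) : made_by(Ctor::Move) {}
};

// Direct-initialization with parentheses from an lvalue.
constexpr Ctor direct_paren_from_lvalue() {
    Probe x{};
    Probe y(x);
    return y.made_by;
}

// Direct-initialization with braces from an lvalue.
constexpr Ctor direct_brace_from_lvalue() {
    Probe x{};
    Probe y{x};
    return y.made_by;
}

// Direct-initialization from an explicit rvalue (xvalue) for contrast.
constexpr Ctor direct_paren_from_xvalue() {
    Probe x{};
    Probe y(static_cast<Probe&&>(x));
    return y.made_by;
}

static_assert(direct_paren_from_lvalue() == Ctor::Copy, "lvalue source selects the copy constructor");
static_assert(direct_brace_from_lvalue() == Ctor::Copy, "braces do not change the selection");
static_assert(direct_paren_from_xvalue() == Ctor::Move, "only an rvalue source selects the move constructor");
```

The destination is constructed directly by the selected constructor in each case, so an lvalue source such as `x` gives no reason for a move constructor call.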

Second interpretation - copy initialization

Here, we can start from [dcl.init.list] 1:

List-initialization is initialization of an object or reference from a braced-init-list. Such an initializer is called an initializer list, and the comma-separated initializer-clauses of the list are called the elements of the initializer list. An initializer list may be empty. List-initialization can occur in direct-initialization or copy initialization contexts; list-initialization in a direct-initialization context is called direct-list-initialization and list-initialization in a copy-initialization context is called copy-list-initialization. [ Note: List-initialization can be used

(1.1) — as the initializer in a variable definition (8.5)

(1.2) — as the initializer in a new-expression (5.3.4)

[... 1.3 through 1.10 omitted ...]

— end note ]

This excerpt can be understood to say that new A[1]{x} is actually a form of list-initialization rather than plain direct-initialization, because a braced-init-list {x} is used. Assuming this is the case, let's look at how it works in [dcl.init.list] 3:

List-initialization of an object or reference of type T is defined as follows:

[... 3.1 through 3.2 omitted ...]

(3.3) — Otherwise, if T is an aggregate, aggregate initialization is performed (8.5.1).

[... 3.4 through 3.10 omitted ...]

In our case, point 3.3 applies as we are initializing an array which is an aggregate, according to [dcl.init.aggr] 1:

An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1), no private or protected non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3).

As such let's look at how aggregate initialization is performed in [dcl.init.aggr] 2:

When an aggregate is initialized by an initializer list, as specified in 8.5.4, the elements of the initializer list are taken as initializers for the members of the aggregate, in increasing subscript or member order. Each member is copy-initialized from the corresponding initializer-clause. If the initializer-clause is an expression and a narrowing conversion (8.5.4) is required to convert the expression, the program is ill-formed.

This fragment tells us that elements are copy initialized. As such y[0] will be copy initialized from x. Now let's look at how copy initialization works in [dcl.init] 17:

The semantics of initializers are as follows. The destination type is the type of the object or reference being initialized and the source type is the type of the initializer expression. If the initializer is not a single (possibly parenthesized) expression, the source type is not defined.

[... 17.1 through 17.5 omitted ...]

(17.6) — If the destination type is a (possibly cv-qualified) class type:

(17.6.1) — If the initialization is direct-initialization, or if it is copy-initialization where the cv-unqualified version of the source type is the same class as, or a derived class of, the class of the destination, constructors are considered. The applicable constructors are enumerated (13.3.1.3), and the best one is chosen through overload resolution (13.3). The constructor so selected is called to initialize the object, with the initializer expression or expression-list as its argument(s). If no constructor applies, or the overload resolution is ambiguous, the initialization is ill-formed.

Just like last time, this initialization fulfills the requirements for point 17.6.1 as it is copy-initialization where the source type (A of x) is the same as the destination type (A of y[0]). This means that in this case the copy constructor will be called as well.
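
The aggregate path can be sketched the same way for an ordinary (non-heap) array, where the compilers agree. `Ctor`, `Elem`, and `array_elements_are_copied` are hypothetical names, and the snippet again assumes C++14 `constexpr` rules.

```cpp
// Tag recording which constructor created an object (illustration only).
enum class Ctor { Default, Copy, Move };

struct Elem {
    Ctor made_by;
    constexpr Elem() : made_by(Ctor::Default) {}
    constexpr Elem(const Elem&) : made_by(Ctor::Copy) {}
    constexpr Elem(Elem&&) : made_by(Ctor::Move) {}
};

// The array is an aggregate, so each element is copy-initialized from its
// initializer-clause. The clause is an lvalue of the same class type, so the
// copy constructor builds the element directly, without a temporary.
constexpr bool array_elements_are_copied() {
    Elem x{};
    Elem arr[2] = {x, x};
    return arr[0].made_by == Ctor::Copy && arr[1].made_by == Ctor::Copy;
}

static_assert(array_elements_are_copied(), "each element is built by the copy constructor");
```

If a temporary were involved, the elements would report `Ctor::Move` instead, which mirrors the extra move constructor call seen with g++ for the heap array.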

Conclusion

It seems that regardless of which interpretation is chosen, the copy constructor should be called only once, and that Clang is right. I was unable to find any evidence that a temporary should be created. For some more example-based evidence, other compilers like icc, and (admittedly clang-based) zapcc and elcc agree with clang, all having only one copy constructor invocation.

I don't know much about g++'s internal workings, but I have a theory about why it performs two copy constructor invocations. It is possible that g++ internally uses some helper constructor invocations that are normally always optimized out, and that the -fno-elide-constructors flag breaks the invariant that they will always be optimized out. This is, however, pure speculation about g++ on my side, so please correct me if I'm wrong.
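
To make the role of `-fno-elide-constructors` a bit more concrete, here is a small sketch, unrelated to arrays, of temporaries that compilers normally elide. `Tracked`, `make_tracked`, and `moves_when_returning_by_value` are hypothetical names. Compiled as C++14, typical builds elide both temporaries and report 0 moves, while adding `-fno-elide-constructors` can surface up to two. Since the result depends on flags and language version, the code only checks that range.

```cpp
#include <cassert>

// Hypothetical helper class: counts how often its move constructor runs.
struct Tracked {
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) = default;
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::moves = 0;

// Returning a temporary by value: a classic candidate for copy elision.
Tracked make_tracked() { return Tracked(); }

// Counts the move-constructor calls needed to initialize t from the call.
int moves_when_returning_by_value() {
    Tracked::moves = 0;
    Tracked t = make_tracked();
    (void)t;
    int n = Tracked::moves;
    assert(n >= 0 && n <= 2);  // 0 with elision, at most 2 without it
    return n;
}
```

This only shows that the flag exposes constructor calls that are otherwise invisible. Whether g++ relies on such elision for heap arrays remains the speculation stated above.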

answered Sep 22 '22 16:09

janekb04
