Is memcpy of a trivially-copyable type construction or assignment?

Tags: c++, copy-constructor, c++11, language-lawyer, memcpy

Let's say you have an object of type T and a suitably-aligned memory buffer alignas(T) unsigned char[sizeof(T)]. If you use std::memcpy to copy from the object of type T to the unsigned char array, is that considered copy construction or copy-assignment?

If a type is trivially-copyable but not standard-layout, it is conceivable that a class such as this:

struct Meow
{
    int x;
protected: // different access-specifier means not standard-layout
    int y;
};

could be implemented like this, because the compiler isn't forced into using standard-layout:

struct Meow_internal
{
private:
    ptrdiff_t x_offset;
    ptrdiff_t y_offset;
    unsigned char buffer[sizeof(int) * 2 + ANY_CONSTANT];
};

The compiler could store x and y of Meow within buffer at any portion of buffer, possibly even at a random offset within buffer, so long as they are aligned properly and do not overlap. The offset of x and y could even vary randomly with each construction if the compiler wishes. (x could go after y if the compiler wishes because the Standard only requires members of the same access-specifier to go in order, and x and y have different access-specifiers.)
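The premise that Meow is trivially copyable yet not standard-layout can be checked with the standard type traits. This is only a minimal sketch: the helper name meow_traits_ok is made up for illustration, and the class is repeated so the snippet stands alone.

```cpp
#include <cassert>
#include <type_traits>

struct Meow
{
    int x;
protected: // different access-specifier means not standard-layout
    int y;
};

// Trivially copyable: no user-provided copy/move operations or destructor.
static_assert(std::is_trivially_copyable<Meow>::value,
              "Meow should be trivially copyable");

// Not standard-layout: x and y have different access control.
static_assert(!std::is_standard_layout<Meow>::value,
              "Meow should not be standard-layout");

// Hypothetical helper so the same two facts can also be checked at run time.
inline bool meow_traits_ok()
{
    return std::is_trivially_copyable<Meow>::value
        && !std::is_standard_layout<Meow>::value;
}
```

Both static_asserts hold at compile time, so the question really is about a type to which the memcpy guarantees apply but the standard-layout guarantees do not.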

This would meet the requirements of being trivially-copyable; a memcpy would copy the hidden offset fields, so the new copy would work. But some things would not work. For example, holding a pointer to x across a memcpy would break:

// Illustrative: assumes a context with access to the protected member y.
Meow a;
a.x = 2;
a.y = 4;
int *px = &a.x;

Meow b;
b.x = 3;
b.y = 9;
std::memcpy(&a, &b, sizeof(a)); // needs <cstring>

++*px; // kaboom

However, is the compiler really allowed to implement a trivially-copyable class in this manner? Dereferencing px should only be undefined behavior if a.x's lifetime has ended. Has it? The relevant portions of the N3797 draft Standard aren't very clear on the subject. This is section [basic.life]/1:

The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. — end note ] The lifetime of an object of type T begins when:

storage with the proper alignment and size for type T is obtained, and

if the object has non-trivial initialization, its initialization is complete.

The lifetime of an object of type T ends when:

if T is a class type with a non-trivial destructor ([class.dtor]), the destructor call starts, or

the storage which the object occupies is reused or released.

And this is [basic.types]/3:

For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes ([intro.memory]) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value. [example omitted]
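The round trip that this paragraph guarantees can be sketched as follows. The accessors set_y and get_y and the function name round_trip_preserves_value are assumptions added for illustration, so that the protected member can be reached without changing the layout question being asked.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

class Meow
{
public:
    int x;
    void set_y(int v) { y = v; }      // hypothetical accessor
    int get_y() const { return y; }   // hypothetical accessor
protected:
    int y;
};

static_assert(std::is_trivially_copyable<Meow>::value,
              "the byte-copy guarantee only covers trivially copyable types");

// Copies the bytes of m into an unsigned char array, overwrites m,
// then copies the bytes back. Returns true if m holds its original value.
inline bool round_trip_preserves_value()
{
    Meow m;
    m.x = 2;
    m.set_y(4);

    alignas(Meow) unsigned char buffer[sizeof(Meow)];
    std::memcpy(buffer, &m, sizeof(Meow));   // object -> bytes

    m.x = 0;                                  // disturb the original
    m.set_y(0);

    std::memcpy(&m, buffer, sizeof(Meow));   // bytes -> object

    return m.x == 2 && m.get_y() == 4;
}
```

Note that nothing here constructs or assigns a Meow in the language sense: only bytes move, and the same object m is used before and after.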

The question then becomes, is a memcpy overwrite of a trivially-copyable class instance "copy construction" or "copy-assignment"? The answer to the question seems to decide whether Meow_internal is a valid way for a compiler to implement trivially-copyable class Meow.

If memcpy is "copy construction", then the answer is that Meow_internal is valid, because copy construction is reusing the memory. If memcpy is "copy-assignment", then the answer is that Meow_internal is not a valid implementation, because assignment does not invalidate pointers to the instantiated members of a class. If memcpy is both, I have no idea what the answer is.

235

asked Oct 03 '14 00:10

Myria

2 Answers

It is clear to me that using std::memcpy results in neither construction nor assignment. It is not construction, since no constructor is called. Nor is it assignment, since no assignment operator is called. Given that a trivially copyable type has a trivial destructor and trivial (copy/move) constructors and (copy/move) assignment operators, the point is rather moot.

You seem to have quoted ¶2 from §3.9 [basic.types]. ¶3 states:

For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied into obj2,⁴¹ obj2 shall subsequently hold the same value as obj1. [ Example:
  T* t1p;
  T* t2p;
          // provided that t2p points to an initialized object ...
  std::memcpy(t1p, t2p, sizeof(T));
          // at this point, every subobject of trivially copyable type in *t1p contains
          // the same value as the corresponding subobject in *t2p
— end example ]
⁴¹ By using, for example, the library functions (17.6.1.2) std::memcpy or std::memmove.

Clearly, the standard intends *t1p to be usable in every way that *t2p is.

Continuing on to ¶4:

The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T). The value representation of an object is the set of bits that hold the value of type T. For trivially copyable types, the value representation is a set of bits in the object representation that determines a value, which is one discrete element of an implementation-defined set of values.⁴²
⁴² The intent is that the memory model of C++ is compatible with that of ISO/IEC 9899 Programming Language C.

The use of the word "the" in front of both defined terms implies that any given type has only one object representation and any given object has only one value representation. Your hypothetical morphing internal type should therefore not exist. The footnote makes it clear that trivially copyable types are intended to have a memory layout compatible with C. The expectation, then, is that even an object of a non-standard-layout type remains usable after its bytes have been copied around.
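As a rough illustration of the object representation described above, the sketch below copies one object over another and then compares both the sizeof(Meow) underlying bytes and the member values. The accessors and the function name copy_yields_identical_representation are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

class Meow
{
public:
    int x;
    void set_y(int v) { y = v; }      // hypothetical accessor
    int get_y() const { return y; }   // hypothetical accessor
protected:
    int y;
};

static_assert(std::is_trivially_copyable<Meow>::value,
              "memcpy between objects requires a trivially copyable type");

// Copies the object representation of b into a, then checks that the
// two objects agree byte for byte and member for member.
inline bool copy_yields_identical_representation()
{
    Meow a;
    a.x = 2;
    a.set_y(4);

    Meow b;
    b.x = 3;
    b.set_y(9);

    std::memcpy(&a, &b, sizeof(Meow));

    // Object representation: the sizeof(Meow) bytes of each object.
    const bool same_bytes = std::memcmp(&a, &b, sizeof(Meow)) == 0;
    // Value: what the members read back as.
    const bool same_value = a.x == b.x && a.get_y() == b.get_y();

    return same_bytes && same_value;
}
```

The byte comparison succeeds because memcpy copies every byte, padding included; the value comparison is what [basic.types]/3 promises for trivially copyable types.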

148

answered Oct 16 '22 09:10

jxh

In the same draft, you also find the following text, directly following the text you quoted:

For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1.

Note that this speaks of changing the value of obj2, not of destroying the object obj2 and creating a new object in its place. Because only the value changes and the object itself persists, any pointers or references to its members should remain valid.
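Applied to the example from the question, this reading can be sketched as below. It assumes the interpretation argued in this answer (memcpy changes the value of a, not its identity); the accessor set_y and the function name increment_through_old_pointer are made up for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

class Meow
{
public:
    int x;
    void set_y(int v) { y = v; }   // hypothetical accessor
protected:
    int y;
};

static_assert(std::is_trivially_copyable<Meow>::value,
              "memcpy between objects requires a trivially copyable type");

// Takes a pointer to a.x, overwrites a with the bytes of b, and then
// increments through the pointer obtained before the copy.
inline int increment_through_old_pointer()
{
    Meow a;
    a.x = 2;
    a.set_y(4);
    int *px = &a.x;                  // taken before the memcpy

    Meow b;
    b.x = 3;
    b.set_y(9);
    std::memcpy(&a, &b, sizeof(a));  // a now holds the same value as b

    ++*px;                           // assumed valid: px still refers to a.x
    return a.x;
}
```

Under that reading the function returns 4: a.x takes the value 3 from b and is then incremented through px.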

answered Oct 16 '22 10:10

celtschk
