#include<cstring>
struct A {
char a;
int b;
};
int main() {
A* a = new A();
a->a = 1;
unsigned char m[sizeof(A)];
std::memcpy(m, a, sizeof(A));
return m[1];
}
Is this program guaranteed to exit with status 0 in C++, aside from possible exceptions due to allocation failure and assuming there is at least one padding byte between a and b in A?
new A() performs value-initialization. Because A's default constructor is trivial, and in particular neither user-provided nor deleted, this zero-initializes the A object, which sets all of its members and padding bits to zero.
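To make the starting point concrete, here is a minimal sketch that inspects the object representation immediately after new A(), before any member is assigned. The helper name value_initialized_is_all_zero is made up for illustration. It only checks the state right after initialization and says nothing about what happens to padding after a later write to a member, which is the actual question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

struct A {
    char a;
    int b;
};

// Hypothetical helper: returns true if every byte of a freshly
// value-initialized A (members and padding) reads as zero.
bool value_initialized_is_all_zero() {
    std::unique_ptr<A> p(new A());   // value-initialization, hence zero-initialization
    assert(p->a == 0 && p->b == 0);  // the members themselves are zero

    unsigned char bytes[sizeof(A)];
    std::memcpy(bytes, p.get(), sizeof(A));  // copy out the object representation
    for (std::size_t i = 0; i < sizeof(A); ++i) {
        if (bytes[i] != 0) {
            return false;
        }
    }
    return true;
}
```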
For C, 6.2.6.1p6 in N1570 (C11 draft) seemed to imply to me that padding bytes are in an unspecified state after assignment to a member, although I may be misinterpreting this (see comments).
But in any case I don't see any rule allowing this in the C++ standard (drafts). On the other hand it isn't specified that padding should be stable either. And e.g. a comment in CWG issue 2536 claims that in C++ padding is never stable at all.
This question is motivated by a source stating, in its second (non-compliant) example, that the padding of a zero-initialized structure may leak information once a member is assigned afterwards. Note, however, that the description of that example is wrong anyway, since it actually performs aggregate-initialization, not value-initialization, and therefore no zero-initialization.
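Independently of what the standard guarantees about padding, the information-leak concern can be sidestepped by never copying padding in the first place. The following is only a sketch under that assumption: the helpers serialize and padding_is_zero_after_serialize are hypothetical names, and the approach (zero the buffer, then copy each member to its offset) is one possible technique, not something taken from the source mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct A {
    char a;
    int b;
};

// Hypothetical helper: fills `out` with zeros, then copies only the
// members of `src` to their offsets, so padding bytes of `src` are never read.
void serialize(const A& src, unsigned char (&out)[sizeof(A)]) {
    std::memset(out, 0, sizeof(A));
    std::memcpy(out + offsetof(A, a), &src.a, sizeof(src.a));
    std::memcpy(out + offsetof(A, b), &src.b, sizeof(src.b));
}

// Hypothetical check: the bytes between a and b in the buffer are zero
// even after a member of the source object has been assigned.
bool padding_is_zero_after_serialize() {
    A obj{};
    obj.a = 1;
    obj.b = 42;

    unsigned char out[sizeof(A)];
    serialize(obj, out);

    for (std::size_t i = offsetof(A, a) + sizeof(char); i < offsetof(A, b); ++i) {
        if (out[i] != 0) {
            return false;
        }
    }
    return out[offsetof(A, a)] == 1;
}
```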
Here are two similar versions of the code that I had in the question earlier, but which probably have UB due to unrelated issues with the method I use to inspect the object representation (see comments):
#include<new>
struct A {
char a;
int b;
};
int main() {
unsigned char* m = new unsigned char[sizeof(A)];
A* a = new(m) A();
a->a = 1;
return m[1];
}
and
struct A {
char a;
int b;
};
int main() {
A* a = new A();
a->a = 1;
return reinterpret_cast<unsigned char*>(a)[1];
}
First, the allocator itself does not zero dynamically allocated memory in C or C++; a fresh block may still hold bytes left over from earlier objects. Whether the object ends up zeroed depends on the initializer: "A* a = new A()" in your main function value-initializes the object and sets its members to zero, whereas "A* a = new A" (without parentheses) default-initializes it and leaves a and b with indeterminate values.
For example, if you change the code to default-initialization like this, the a->b field may contain garbage (strictly speaking, reading it is undefined behavior):
int main() {
volatile int* garbage = new int(12345678); // allocate memory on the heap and initialize it with the value 12345678
std::cout << (*garbage) << std::endl; // use the variable so that the compiler does not remove it when optimizing
delete garbage;
A* a = new A; // no parentheses: default-initialization; the allocator may reuse the same block
a->a = 1;
std::cout << a->b << std::endl; // indeterminate value; may output the leftover from the garbage variable
...
To avoid depending on how the object was created, you can add a constructor to the structure that zeroes it explicitly; it runs whenever an A is created (this needs <cstring>):
A()
{
std::memset(this, 0, sizeof(A)); // explicitly zero the whole object, padding included
}
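As an alternative to a memset in the constructor, default member initializers also guarantee zeroed members regardless of whether the object is created with new B or new B(). This is only a sketch: B is a hypothetical variant of A, and default_initialized_b is a made-up helper. Note that this approach makes no promise about padding bytes, only about the members.

```cpp
#include <cassert>
#include <memory>

// Hypothetical variant of A with default member initializers.
struct B {
    char a = 0;
    int b = 0;
};

// Hypothetical helper: creates a B with default-initialization (no
// parentheses) and returns b, which the member initializer set to 0.
int default_initialized_b() {
    std::unique_ptr<B> p(new B);  // the implicit constructor applies the initializers
    return p->b;
}
```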
Finally, to get a predictable (WYSIWYG) layout of the structure fields in memory, you have to instruct the compiler not to align the fields on machine word boundaries, which it otherwise does to speed up reads and writes. The syntax of the directive is compiler-dependent; see "C++ struct alignment question".
Here is a complete example program for the GCC compiler:
#include <cstring>
#include <iostream>
#include <cassert>
// to disable field alignment, attach this attribute to the structure definition
// (named PACKED because identifiers containing double underscores are reserved)
#define PACKED __attribute__((packed))
struct A {
A()
{
std::memset(this, 0, sizeof(A)); // explicitly zero the whole object
}
char a;
int b;
} PACKED; // now it is a packed structure (no alignment padding between fields)
int main() {
volatile int* garbage = new int(12345678);
std::cout << (*garbage) << std::endl;
delete garbage;
A* a = new A();
a->a = 1;
std::cout << a->b << std::endl; // now here's 0
unsigned char m[sizeof(A)];
std::memcpy(m, a, sizeof(A));
assert(m[0] == 1);
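int b_copy;
std::memcpy(&b_copy, m + 1, sizeof b_copy); // copy b out instead of a misaligned read through int*
assert(b_copy == 0);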
assert(sizeof(A) == sizeof(char)+sizeof(int));
std::cout << "return value: " << int(m[1]) << std::endl;
return m[1];
}
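Since the attribute above is GCC syntax, here is a sketch of the same idea using #pragma pack, which GCC, Clang and MSVC all accept. The struct name PackedA and the helpers packed_size and packed_offset_of_b are made up for illustration; the static_asserts fail at compile time if the compiler does not actually remove the padding.

```cpp
#include <cassert>
#include <cstddef>

#pragma pack(push, 1)  // request 1-byte alignment for the members that follow
struct PackedA {       // hypothetical packed counterpart of A
    char a;
    int b;
};
#pragma pack(pop)      // restore the previous packing setting

// Compile-time checks that no padding was inserted.
static_assert(sizeof(PackedA) == sizeof(char) + sizeof(int),
              "PackedA should contain no padding");
static_assert(offsetof(PackedA, b) == sizeof(char),
              "b should directly follow a");

// Hypothetical helpers exposing the layout at run time.
std::size_t packed_size() { return sizeof(PackedA); }
std::size_t packed_offset_of_b() { return offsetof(PackedA, b); }
```

Keep in mind that taking a pointer or reference to b in a packed structure can yield a misaligned int, so copying it out with std::memcpy is the safer way to read it through a byte buffer.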