Here is the code:
unsigned int a; // a is indeterminate
unsigned long long b = 1; // b is initialized to 1
std::memcpy(&a, &b, sizeof(unsigned int));
unsigned int c = a; // Is this not undefined behavior? (Implementation-defined behavior?)
Is the variable a guaranteed by the standard to hold a determinate value at the point where we access it to initialize c? Cppreference says:
void* memcpy( void* dest, const void* src, std::size_t count );
Copies count bytes from the object pointed to by src to the object pointed to by dest. Both objects are reinterpreted as arrays of unsigned char.
But I don't see anywhere in cppreference that says if an indeterminate value is "copied to" like this, it becomes determinate.
From the standard, it seems it's analogous to this:
unsigned int a; // a is indeterminate
unsigned long long b = 1; // b is initialized to 1
auto* a_ptr = reinterpret_cast<unsigned char*>(&a);
auto* b_ptr = reinterpret_cast<unsigned char*>(&b);
a_ptr[0] = b_ptr[0];
a_ptr[1] = b_ptr[1];
a_ptr[2] = b_ptr[2];
a_ptr[3] = b_ptr[3];
unsigned int c = a; // Is this undefined behavior? (Implementation defined behavior?)
It seems like the standard leaves room for this to be allowed, because the type aliasing rules allow the object a to be accessed as an unsigned char this way. But I can't find anything that says this makes a no longer indeterminate.
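As a rough sketch of the analogy above, the following compares the bytes written by std::memcpy with the bytes written by a manual unsigned char loop. The function names are hypothetical and not part of the question. The comparison uses std::memcmp on the object representations, so it never reads either destination as an unsigned int and does not depend on the answer to the question itself. It assumes, as mainstream implementations allow, that the bytes of an object can be walked through an unsigned char pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true when std::memcpy and a manual
// unsigned char loop leave identical bytes in the destination.
bool bytewise_copy_matches_memcpy(unsigned long long b) {
    static_assert(sizeof(unsigned int) <= sizeof(unsigned long long),
                  "the copy must not read past the end of b");

    unsigned int via_memcpy;  // indeterminate until written below
    unsigned int via_loop;    // indeterminate until written below

    // Library copy of the first sizeof(unsigned int) bytes of b.
    std::memcpy(&via_memcpy, &b, sizeof via_memcpy);

    // Manual copy of the same bytes, one unsigned char at a time.
    auto* dst = reinterpret_cast<unsigned char*>(&via_loop);
    auto* src = reinterpret_cast<const unsigned char*>(&b);
    for (std::size_t i = 0; i < sizeof via_loop; ++i) {
        dst[i] = src[i];
    }

    // Compare object representations only; neither object is read
    // as an unsigned int here.
    return std::memcmp(&via_memcpy, &via_loop, sizeof via_loop) == 0;
}

// Hypothetical check that exercises the helper.
void check_bytewise_copy() {
    assert(bytewise_copy_matches_memcpy(1ull));
}
```

This only shows that both snippets store the same bytes; whether reading a afterwards is well defined is the actual question.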
Is this not undefined behavior?
It's UB, because you're copying into the wrong type. [basic.types]/2 and [basic.types]/3 permit byte copying, but only between objects of the same type. You copied from an unsigned long long into an unsigned int. That has nothing to do with the value being indeterminate. Even though you're only copying sizeof(unsigned int) bytes, the fact that you're not copying from an actual unsigned int means that you don't get the protection of those rules.
If you were copying into an object of the same type, then [basic.types]/3 says that it's equivalent to simply assigning them. That is, a "shall subsequently hold the same value as" b.
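A minimal sketch of the same-type case described above, where both objects are unsigned int and the whole object representation is copied. The helper names are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: copies the full object representation of one
// unsigned int into another unsigned int.
unsigned int copy_same_type(unsigned int src) {
    unsigned int dst;                     // indeterminate until the copy
    std::memcpy(&dst, &src, sizeof dst);  // same type, all bytes copied
    return dst;                           // holds the same value as src
}

// Hypothetical check that exercises the helper.
void check_same_type_copy() {
    assert(copy_same_type(1u) == 1u);
    assert(copy_same_type(12345u) == 12345u);
}
```

Here the destination starts out indeterminate, just like a in the question, and the copy is what gives it a value.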
TL;DR: It's implementation-defined whether there will be undefined behavior or not. Here is a proof-style walkthrough, one line of code at a time:
unsigned int a;
The variable a is assumed to have automatic storage duration. Its lifetime begins (6.6.3/1). Since it is not a class, its lifetime begins with default initialization, in which no other initialization is performed (9.3/7.3).
unsigned long long b = 1ull;
The variable b is assumed to have automatic storage duration. Its lifetime begins (6.6.3/1). Since it is not a class, its lifetime begins with copy-initialization (9.3/15).
std::memcpy(&a, &b, sizeof(unsigned int));
Per 16.2/2, std::memcpy should have the same semantics and preconditions as the C standard library's memcpy. Per 7.21.2.1 of the C standard, assuming sizeof(unsigned int) == 4, 4 characters are copied from the object pointed to by &b into the object pointed to by &a. (These two points are what is missing from other answers.)
At this point, the sizes of unsigned int and unsigned long long, their representations (e.g. endianness), and the size of a character are all implementation-defined (to my understanding, see 6.7.1/4 and its note saying that ISO C 5.2.4.2.1 applies). I will assume that the implementation is little-endian, unsigned int is 32 bits, unsigned long long is 64 bits, and a character is 8 bits.
Now that I have said what the implementation is, I know that a holds the value representation of the unsigned int 1u. Nothing, so far, has been undefined behavior.
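The assumptions above can be spelled out in code. This is only a sketch of this answer's reasoning, with hypothetical helper names; it detects endianness at run time and asserts the expected result only on a little-endian implementation where unsigned int is narrower than unsigned long long.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true when the least significant byte of an
// unsigned int is stored at the lowest address (little-endian).
bool is_little_endian() {
    const unsigned int probe = 1u;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Hypothetical helper: copies the first sizeof(unsigned int) bytes of b
// into an unsigned int and reads the result back, as in the question.
unsigned int leading_bytes_as_uint(unsigned long long b) {
    static_assert(sizeof(unsigned int) <= sizeof(unsigned long long),
                  "the copy must not read past the end of b");
    unsigned int a;                 // indeterminate until the copy
    std::memcpy(&a, &b, sizeof a);  // copies the lowest-addressed bytes
    return a;
}

// Hypothetical check: only the little-endian case is asserted, because
// that is the implementation assumed in the text.
void check_little_endian_result() {
    if (is_little_endian()) {
        assert(leading_bytes_as_uint(1ull) == 1u);
    }
}
```

On a big-endian implementation with a wider unsigned long long, the copied bytes would be the most significant ones, so the result would differ; that is why the assumptions matter.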
unsigned int c = a;
Now we access a. Then, 6.7/4 says that
For trivially copyable types, the value representation is a set of bits in the object representation that determines a value, which is one discrete element of an implementation-defined set of values.
I know now that the value of a is determined by the implementation-defined value bits in a, which I know hold the value representation for 1u. The value of a is then 1u.
Then, as with the initialization of b, the variable c is copy-initialized to 1u.
We made use of implementation-defined values to find what happens. It is possible that the bits copied from the representation of 1ull do not correspond to any element of the implementation-defined set of values for unsigned int. In that case, accessing a will be undefined behavior, because the standard doesn't say what happens when you access a variable whose value representation is invalid.
AFAIK, we can take advantage of the fact that most implementations define an unsigned int where any possible bit pattern is a valid value representation. Therefore, there will be no undefined behavior.
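One way to check that property on a given implementation is to compare the number of value bits in unsigned int with the number of bits in its object representation. This is a sketch with hypothetical function names; when the two counts match, there are no padding bits, so every bit pattern corresponds to some value.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Hypothetical helper: true when every bit in the object representation
// of unsigned int is a value bit, i.e. there are no padding bits.
constexpr bool uint_has_no_padding_bits() {
    return std::numeric_limits<unsigned int>::digits ==
           static_cast<int>(sizeof(unsigned int) * CHAR_BIT);
}

// Hypothetical check: fails at run time on an implementation whose
// unsigned int has padding bits.
void require_no_padding_bits() {
    assert(uint_has_no_padding_bits());
}
```

The same condition could be enforced at compile time with static_assert(uint_has_no_padding_bits(), "...") before relying on the argument above.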