Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Strict aliasing rule

I'm reading notes about reinterpret_cast and it's aliasing rules ( http://en.cppreference.com/w/cpp/language/reinterpret_cast ).

I wrote that code:

struct A
{
  int t;
};

char *buf = new char[sizeof(A)];

A *ptr = reinterpret_cast<A*>(buf);
ptr->t = 1;

A *ptr2 = reinterpret_cast<A*>(buf);
cout << ptr2->t;

I think these rules doesn't apply here:

  • T2 is the (possibly cv-qualified) dynamic type of the object
  • T2 and T1 are both (possibly multi-level, possibly cv-qualified at each level) pointers to the same type T3 (since C++11)
  • T2 is an aggregate type or a union type which holds one of the aforementioned types as an element or non-static member (including, recursively, elements of subaggregates and non-static data members of the contained unions): this makes it safe to cast from the first member of a struct and from an element of a union to the struct/union that contains it.
  • T2 is the (possibly cv-qualified) signed or unsigned variant of the dynamic type of the object
  • T2 is a (possibly cv-qualified) base class of the dynamic type of the object
  • T2 is char or unsigned char

In my opinion this code is incorrect. Am I right? Is code correct or not?

On the other hand what about connect function (man 2 connect) and struct sockaddr?

   int connect(int sockfd, const struct sockaddr *addr,
               socklen_t addrlen);

Eg. we have struct sockaddr_in and we have to cast it to struct sockaddr. Above rules also doesn't apply, so is this cast incorrect?

like image 615
Adam Avatar asked Jul 24 '15 16:07

Adam


2 Answers

Yeah, it's invalid, but not because you're converting a char* to an A*: it's because you are not obtaining a A* that actually points to an A* and, as you've identified, none of the type aliasing options fit.

You'd need something like this:

#include <new>
#include <iostream>

struct A
{
  int t;
};

char *buf = new char[sizeof(A)];

A* ptr = new (buf) A;
ptr->t = 1;

// Also valid, because points to an actual constructed A!
A *ptr2 = reinterpret_cast<A*>(buf);
std::cout << ptr2->t;

Now type aliasing doesn't come into it at all (though keep reading because there's more to do!).

  • (live demo with -Wstrict-aliasing=2)

In reality, this is not enough. We must also consider alignment. Though the above code may appear to work, to be fully safe and whatnot you will need to placement-new into a properly-aligned region of storage, rather than just a casual block of chars.

The standard library (since C++11) gives us std::aligned_storage to do this:

using Storage = std::aligned_storage<sizeof(A), alignof(A)>::type;
auto* buf = new Storage;

Or, if you don't need to dynamically allocate it, just:

Storage data;

Then, do your placement-new:

new (buf) A();
// or: new(&data) A();

And to use it:

auto ptr = reinterpret_cast<A*>(buf);
// or: auto ptr = reinterpret_cast<A*>(&data);

All in it looks like this:

#include <iostream>
#include <new>
#include <type_traits>

struct A
{
  int t;
};

int main()
{
    using Storage = std::aligned_storage<sizeof(A), alignof(A)>::type;

    auto* buf = new Storage;
    A* ptr = new(buf) A();

    ptr->t = 1;

    // Also valid, because points to an actual constructed A!
    A* ptr2 = reinterpret_cast<A*>(buf);
    std::cout << ptr2->t;
}

(live demo)

Even then, since C++17 this is somewhat more complicated; see the relevant cppreference pages for more information and pay attention to std::launder.

Of course, this whole thing appears contrived because you only want one A and therefore don't need array form; in fact, you'd just create a bog-standard A in the first place. But, assuming buf is actually larger in reality and you're creating an allocator or something similar, this makes some sense.

like image 194
Lightness Races in Orbit Avatar answered Oct 13 '22 18:10

Lightness Races in Orbit


The C aliasing rules from which the rules of C++ were derived included a footnote specifying that the purpose of the rules was to say when things may alias. The authors of the Standard didn't think it necessary to forbid implementations from applying the rules in needlessly restrictive fashion in cases where things don't alias, because they thought compiler writers would honor the proverb "Don't prevent the programmer from doing what needs to be done", which the authors of the Standard viewed as part of the Spirit of C.

Situations where it would be necessary to use an lvalue of an aggregate's member type to actually alias a value of the aggregate type are rare, so it's entirely reasonable that the Standard doesn't require compilers to recognize such aliasing. Applying the rules restrictively in cases that don't involve aliasing, however, would cause something like:

union foo {int x; float y;} foo;
int *p = &foo.x;
*p = 1;

or even, for that matter,

union foo {int x; float y;} foo;
foo.x = 1;

to invoke UB since the assignment is used to access the stored values of a union foo and a float using an int, which is not one of the allowed types. Any quality compiler, however, should be able to recognize that an operation done on an lvalue which is visibly freshly derived from a union foo is an access to a union foo, and an access to a union foo is allowed to affect the stored values of its members (like the float member in this case).

The authors of the Standard probably declined to make the footnote normative because doing so would require a formal definition of when an access via freshly-derived lvalue is an access to the parent, and what kinds of access patterns constitute aliasing. While most cases would be pretty clear cut, there are some corner cases which implementations intended for low-level programming should probably interpret more pessimistically than those intended for e.g. high-end number crunching, and the authors of the Standard figured that anyone who could figure out how to handle the harder cases should be able to handle the easy ones.

like image 36
supercat Avatar answered Oct 13 '22 17:10

supercat