Cast between structs from different scopes

I am interested in casting between pointers to structs that are potentially compatible. They'll use the same tag, the same members in the same order. Although the target codebase is compiled as either C or C++, for the sake of simplifying this question I would like to restrict this to C++ only.

This is a situation where I am confident that the compiler will behave reasonably, but I cannot find supporting evidence that it is required to do so.

The motivating code example is:

#include <cstdio>

void foo(void * arg)
{
    struct example
    {
        int a;
        const char * b;
    };

    example * myarg = static_cast<example *>(arg);
    printf("meaning of %s is %d\n",myarg->b,myarg->a);
}

void bar(void)
{
    struct example
    {
        int a;
        const char * b;
    };

    example on_stack {42, "life"};
    foo(&on_stack);
}

int main(int,char**)
{
    bar();
}

I have had less luck with the C++11 standard. Section 9 on classes suggests the examples will be "layout-compatible", which sounds encouraging, but I can't find a description of the consequences of structures being "layout-compatible". In particular, can I cast a pointer of one to a pointer of the other without consequences?

A colleague believes "layout-compatible" means memcpy will work as expected. Given that the struct in question is also always trivially copyable, it is possible that the following nominally inefficient code would avoid UB:

#include <cstdio>
#include <cstring>

void foo(void * arg)
{
    struct example
    {
        int a;
        const char * b;
    };

    example local;
    std::memcpy(&local, arg, sizeof(example));
    printf("meaning of %s is %d\n", local.b, local.a);
}

// bar and main as before

The actual motivation for this is to get the struct definition out of global scope when it's only used for communication between a small number of functions. I appreciate that it is debatable whether this is a good idea.

Jon Chesterfield asked May 19 '16 12:05


2 Answers

Does [basic.lval] 10.6 allow aliasing between layout compatible types? No. The section in question states:

an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union)

Recall that "the aforementioned types" are the dynamic type of the object, a const/volatile-qualified version of the dynamic type, a type similar to the dynamic type, a signed/unsigned type corresponding to the dynamic type, or a signed/unsigned type corresponding to a const/volatile-qualified version of the dynamic type.

Now, consider this code:

struct T {int i;};
struct U {int i;};

T t;
U *pu = (U*)&t;
pu->i = 5;

Now, let's look at 10.6 in that light. The question 10.6 asks is if the glvalue's type U contains a member that fits the qualifications of 10.1-10.5. Does it? Remember that the dynamic type of the object t is T.

  • Does U contain a member of type T? No.
  • Does U contain a member which is a const/volatile qualified version of T? No.
  • Does U contain a member that is of a type which is similar to T? No.
  • Does U contain a member that is a signed/unsigned version of T? No.
  • Does U contain a member that is a const/volatile qualified version of a signed/unsigned version of T? No.

Since all of those fail, the compiler is allowed to assume that modifying the object pointed to by pu will not modify the object t.


FYI, regarding this claim:

"Anyway, memcopy and pointer aliasing is exactly the same, except for global struct alignment."

No, they aren't. The rules for trivial copy-ability and layout compatibility are not at all the same as the rules for aliasing.

Trivial copyability is about the sanity of copying the value representation of an object and whether such a copy represents a legitimate object. The rules of layout compatibility are about whether the value representation of A is compatible with B, such that a value of A could be copied into an object of type B.
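To make the copying side concrete, here is a minimal sketch of the properties you would want to check before copying bytes from one struct type into another. The names sender_example, receiver_example and copy_from are hypothetical, introduced only for illustration. It assumes, as described above, that layout-compatible trivially copyable types can exchange values via memcpy; the static_asserts check trivial copyability, standard layout and size, not layout compatibility itself.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Two distinct types with the same members in the same order
// (hypothetical names, standing in for the two local `example` structs).
struct sender_example   { int a; const char * b; };
struct receiver_example { int a; const char * b; };

// Properties the byte-wise copy relies on. These do not prove layout
// compatibility; they only rule out the obviously unsafe cases.
static_assert(std::is_trivially_copyable<sender_example>::value,
              "sender must be trivially copyable");
static_assert(std::is_trivially_copyable<receiver_example>::value,
              "receiver must be trivially copyable");
static_assert(std::is_standard_layout<sender_example>::value &&
              std::is_standard_layout<receiver_example>::value,
              "both types must be standard-layout");
static_assert(sizeof(sender_example) == sizeof(receiver_example),
              "sizes must match");

// Copies the bytes into a new object instead of aliasing the original.
receiver_example copy_from(const void * src)
{
    assert(src != nullptr);
    receiver_example out;
    std::memcpy(&out, src, sizeof out);
    return out;
}
```

Calling copy_from(&s) on a sender_example s returns an independent receiver_example; no pointer to one type is ever dereferenced as the other.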

Aliasing is about whether it is possible to access an object through a pointer/reference to A and a pointer/reference to B at the same time. The strict aliasing rule states that if the compiler sees an A& a and a B& b, it is allowed to assume that modifications made through a will not affect the object referenced through b, and vice versa. [basic.lval] 10 outlines the cases when the compiler is not allowed to assume this.
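A small sketch of what that assumption looks like in practice (the names A, B, write_both and check_with_distinct_objects are made up for illustration). Because A and B are unrelated types, a compiler may treat the two writes as independent and return the constant 1 without reloading a.i. With genuinely distinct objects that is also the only possible result; the function would only misbehave if someone cast a pointer so that a and b referred to the same object, which is exactly the UB case.

```cpp
#include <cassert>

struct A { int i; };
struct B { int i; };

// The compiler may assume `a` and `b` never refer to the same object,
// since neither type is one of the permitted alias types for the other.
int write_both(A & a, B & b)
{
    a.i = 1;
    b.i = 2;     // assumed not to modify a.i
    return a.i;  // may legally be compiled as "return 1"
}

// Well-defined use: two separate objects, one of each type.
void check_with_distinct_objects()
{
    A a{0};
    B b{0};
    int result = write_both(a, b);
    assert(result == 1);
    assert(b.i == 2);
}
```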

Nicol Bolas answered Sep 22 '22 17:09


It is now clear (thanks to Nicol Bolas's answer) that direct aliasing between two structs that are simply layout compatible would invoke UB because of the strict aliasing rule.

Of course you can memcpy the content, but:

  • it may be expensive, depending on the struct size
  • you only get a copy (changes will not be reflected) unless you memcpy back when done
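For reference, the copy-in/copy-back variant could look roughly like the sketch below. The function name halve_a is hypothetical, and it assumes the caller's struct is trivially copyable and has the same members in the same order as the local one.

```cpp
#include <cassert>
#include <cstring>

// Sketch: work on a private copy, then write the bytes back so the
// caller sees the change. Costs two copies of the whole struct.
void halve_a(void * arg)
{
    struct example
    {
        int a;
        const char * b;
    };

    assert(arg != nullptr);

    example local;
    std::memcpy(&local, arg, sizeof local);  // copy in
    local.a /= 2;                            // modify the copy only
    std::memcpy(arg, &local, sizeof local);  // copy back
}
```

If the copy back is forgotten, the caller's object silently keeps its old value, which is the second drawback listed above.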

But... in C++ you can create a struct of references that refer to the original values. Each reference directly aliases a member through that member's own type, which is perfectly defined by the standard.

Code for foo could become:

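#include <cstddef>  // offsetof
#include <cstdio>   // printf

void foo(void * arg)
{
    struct example // only used to declare the layout
    {
        int a;
        const char * b;
    };
    struct r_example // references bound to the caller's members
    {
        int &a;
        const char *&b;
        // a is the first member; b is located via its offset in example
        r_example(void *ext): a(*(static_cast<int*>(ext))),
            b(*(reinterpret_cast<const char **>(
                static_cast<char*>(ext) + offsetof(example, b)))) {}
    };

    r_example myarg(arg);
    printf("in foo meaning of %s is %d\n",myarg.b,myarg.a);
    myarg.a /= 2; // writes through the reference to the caller's int
}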

And the change introduced in the last line is visible in the caller, without UB:

void bar(void)
{
    struct example
    {
        int a;
        const char * b;
    };

    example on_stack {42, "life"};
    foo(&on_stack);
    printf("after foo meaning of %s is %d\n",on_stack.b,on_stack.a);
}

This displays:

in foo meaning of life is 42
after foo meaning of life is 21

The C counterpart uses pointers instead of references:

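    /* inside foo(void * arg), after the local struct example;
       needs <stddef.h> for offsetof and <stdio.h> for printf */
    struct p_example {
        int *a;
        const char **b;
    } myarg;
    myarg.a = (int *) arg;
    myarg.b = (const char **)(((char *) arg) + offsetof(struct example, b));

    printf("in foo meaning of %s is %d\n",*(myarg.b),*(myarg.a));
    *(myarg.a) /= 2;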

Serge Ballesta answered Sep 22 '22 17:09