
Is it legal and well defined behavior to use a union for conversion between two structs with a common initial sequence (see example)?

Tags: c++, c, unions, c99, c89

I have an API with a publicly facing struct A and an internal struct B and need to be able to convert a struct B into a struct A. Is the following code legal and well defined behavior in C99 (and VS 2010/C89) and C++03/C++11? If it is, please explain what makes it well-defined. If it's not, what is the most efficient and cross-platform means for converting between the two structs?

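#include <stdint.h>

struct A {
  uint32_t x;
  uint32_t y;
  uint32_t z;
};

struct B {
  uint32_t x;
  uint32_t y;
  uint32_t z;
  uint64_t c;
};

union U {
  struct A a;
  struct B b;
};

/* Assumed to be defined elsewhere; the signature is inferred from the call below. */
void DoSomething(uint32_t x, uint32_t y, uint32_t z);

int main(int argc, char* argv[]) {
  union U u;  /* "union U" is required in C and also valid in C++ */
  u.b.x = 1;
  u.b.y = 2;
  u.b.z = 3;
  u.b.c = 64;

  /* Is it legal and well-defined behavior to read a union member other than the one last written in this case? */
  DoSomething(u.a.x, u.a.y, u.a.z);

  return 0;
}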


UPDATE

I simplified the example and wrote two small programs: one based on memcpy and the other using a union.


Union:

struct A {
  int x;
  int y;
  int z;
};

struct B {
  int x;
  int y;
  int z;
  long c;
};

union U {
  struct A a;
  struct B b;
};

int main(int argc, char* argv[]) {
  union U u;  /* "union U" is required in C and also valid in C++ */
  u.b.x = 1;
  u.b.y = 2;
  u.b.z = 3;
  u.b.c = 64;
  const struct A* a = &u.a;
  return 0;
}


memcpy:

#include <string.h>

struct A {
  int x;
  int y;
  int z;
};

struct B {
  int x;
  int y;
  int z;
  long c;
};

int main(int argc, char* argv[]) {
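  struct B b;  /* the "struct" keyword is required in C and also valid in C++ */
  b.x = 1;
  b.y = 2;
  b.z = 3;
  b.c = 64;
  struct A a;
  /* Copy only the leading sizeof(struct A) bytes of b into a. */
  memcpy(&a, &b, sizeof(a));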
  return 0;
}



Profiled Assembly [DEBUG] (Xcode 6.4, default C++ compiler):

Here is the relevant difference in the generated assembly for the debug build. For the release builds, the assembly of the two programs was identical.


Union:

movq     %rcx, -48(%rbp)


memcpy:

movq    -40(%rbp), %rsi
movq    %rsi, -56(%rbp)
movl    -32(%rbp), %edi
movl    %edi, -48(%rbp)



Caveat:

The union-based example produces a warning that variable 'a' is unused. Since the assembly above comes from a debug build, I don't know whether that affects the comparison.

asked Jul 22 '15 03:07 by Coder


1 Answer

This is fine, because the members you are accessing are elements of a common initial sequence.

C11 (6.5.2.3 Structure and union members; Semantics):

[...] if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

C++03 ([class.mem]/16):

If a POD-union contains two or more POD-structs that share a common initial sequence, and if the POD-union object currently contains one of these POD-structs, it is permitted to inspect the common initial part of any of them. Two POD-structs share a common initial sequence if corresponding members have layout-compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

Other versions of the two standards have similar language; since C++11 the terminology used is standard-layout rather than POD.
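
As a minimal sketch of the rule quoted above, here it is applied to the structs from the question, written in C. The helper name sum_xyz is hypothetical and only there for illustration. The point is that x, y and z are read through the A member after the B member was written, and that this happens where the complete declaration of union U is visible.

```c
#include <stdint.h>

struct A { uint32_t x; uint32_t y; uint32_t z; };
struct B { uint32_t x; uint32_t y; uint32_t z; uint64_t c; };

/* x, y, z form the common initial sequence of struct A and struct B. */
union U {
  struct A a;
  struct B b;
};

/* Hypothetical helper: reads only the common initial part through the
   A view. The completed type of union U is visible here, which is what
   the C11 wording asks for. */
static uint32_t sum_xyz(const union U *u) {
  return u->a.x + u->a.y + u->a.z;
}
```

A caller would write the fields through u.b and then pass &u to sum_xyz. Reading u.b.c through the A view is not covered, since c is not part of the common initial sequence.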


I think the confusion may have arisen because C permits type-punning (reading a member of a different type than the one last written) via a union, whereas C++ does not; that is the main case where you would have to use memcpy to keep code valid in both C and C++. In your case, however, the members you are accessing have the same type and are preceded by members of compatible types, so the type-punning rule is not relevant.
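
For contrast, here is a hedged sketch of the type-punning case where memcpy is the portable choice. The helper float_bits is a hypothetical name, and the code assumes that float and uint32_t have the same size on the target platform.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of a float as a
   32-bit unsigned integer. Assumes sizeof(float) == sizeof(uint32_t).
   Copying the bytes with memcpy is well defined in both C and C++,
   whereas reading an inactive union member of a different type is only
   permitted in C. */
static uint32_t float_bits(float f) {
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);
  return bits;
}
```

This is the situation the type-punning rule is about: a float is reinterpreted as an integer. In the question, by contrast, uint32_t members are read as uint32_t members.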

answered Sep 30 '22 18:09 by ecatmur