
C++ Undefined behaviour with unions

Tags:

c++

I was just reading about anonymous structures, how they aren't standard, and how a common use case for them is undefined behaviour...

This is the basic case:

struct Point {
    union {
       struct {
           float x, y;
       };
       float v[2];
    };
};

So writing to x and then reading from v[0] would be undefined: you would expect them to hold the same value, but that is not guaranteed.

I'm not sure whether the standard covers this, but what about a union whose members all have the same type?

union{ float a; float b; };

Is it undefined to write to a and then read from b?

That is to say, does the standard say anything about the binary representation of arrays and sequential variables of the same type?

johndoe asked Jun 24 '13 10:06



1 Answer

The standard says that reading from any member of a union other than the last one written is undefined behavior. In theory, the compiler could generate code which somehow kept track of the reads and writes, and triggered a signal if you violated the rule (even if the two members have the same type). A compiler could also use this rule for some sort of optimization: if you write to a (or x), it can assume that you do not read b (or v[0]) when optimizing.

In practice, every compiler I know supports this if the union is clearly visible. There are also cases in many (most? all?) compilers where even legal use will fail if the union is not visible, e.g.:

union  U { int i; float f; };

int f( int* pi, float* pf ) { int r = *pi; *pf = 3.14159f; return r; }

//  ...
U u;
u.i = 1;
std::cout << f( &u.i, &u.f );

I've actually seen this fail with g++, although according to the standard, it is perfectly legal.
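As a rough illustration of what "clearly visible" could mean, here is a minimal sketch in which the function receives the union itself rather than pointers to its members, so every access names the union. The names read_then_overwrite and usage_example are hypothetical and not part of the answer; the sketch reads only the member that is currently active.

```cpp
#include <cassert>

union U { int i; float f; };

// Hypothetical variant of f() above: it takes the union by reference, so the
// compiler can see that both accesses refer to the same union object.
int read_then_overwrite(U& u) {
    int r = u.i;      // read i while it is the active member
    u.f = 3.14159f;   // writing f makes f the active member
    return r;
}

// Hypothetical usage: each read targets the member written most recently.
void usage_example() {
    U u;
    u.i = 1;
    int r = read_then_overwrite(u);
    assert(r == 1);
    assert(u.f == 3.14159f);
}
```

This is only one way to arrange the code, and it does not make reading an inactive member defined; it just avoids handing the compiler two unrelated-looking pointers.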

Also, even if the compiler supports writing to Point::x and reading from Point::v[0], there's no guarantee that Point::y and Point::v[1] even have the same physical address.
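One way to sidestep both concerns is to store the coordinates once and expose the names as accessors, so there is no union and no second object whose address could differ. The sketch below is an assumption about how that might look, not something from the question or the answer; ArrayPoint and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical alternative to the Point in the question: a single array holds
// the data, and x()/y() are just named views onto its elements.
struct ArrayPoint {
    float v[2];

    float& x() { return v[0]; }
    float& y() { return v[1]; }
    const float& x() const { return v[0]; }
    const float& y() const { return v[1]; }

    // Indexed access with a debug-time bounds check.
    float& operator[](std::size_t i) {
        assert(i < 2);
        return v[i];
    }
};
```

With this layout, p.x() and p.v[0] refer to the same float object by construction, at the cost of writing p.x() instead of p.x.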

James Kanze answered Sep 17 '22 14:09