 

Union memory share in C

Tags: c, unions

Edit 2: Can I implement polymorphism with a union? It seems to me that I can change the data structure based on my needs.

Edit: Fixed the code to use "." instead of "->". What I want to ask is how to make sure the value is stored correctly when the members have different data types (for example, int and char used interchangeably). Since the two types have different sizes, the union is allocated enough memory for the larger one, and both members share that space.

Suppose I have 2 structs:

typedef struct a {
    int a;
} aType;

typedef struct b {
    char b;
} bType;

typedef union {
    aType a_type;
    bType b_type;
} ab;

int main(void) {
    ab v1;
    v1.a_type.a = 5;
    v1.b_type.b = 'a';
}

As far as I know, aType and bType share the same memory. Since int is 3 bytes larger than char (int is 4 bytes and char is 1 byte), the union occupies 4 bytes. Call the first byte the leftmost and the last byte the rightmost. When I assign 'a' to member b of v1, it is stored in the first (leftmost) byte, while the value 5 still remains in the fourth (rightmost) byte.

Therefore, when I print the value, it will produce garbage, won't it? If so, how can I fix this? That is, when I store 'a' into b_type, how do I make sure the shared memory holds only the value 'a' and not the previous integer value 5?

Amumu asked Feb 25 '23 09:02


2 Answers

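There is no single right behavior. Writing a union through one member and then reading it through a different member reinterprets the stored bytes as the other type, and any bytes not covered by the member last written have unspecified values, so the result is not portable (in C++ such a read is undefined behavior). You can do useful things with this technique, but it is very hardware and compiler dependent. You will need to consider processor endianness and memory alignment requirements.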

Back when I did almost all my programming in C, there were two (portable) techniques using unions that I relied on pretty heavily.

A tagged union. This is great when you need a dynamically typed variable. You set up a struct with two fields: a type discriminant and a union of all possible types.

struct variant {
  enum { INT, CHAR, FLOAT } type;
  union {
    int i;
    char c;
    float f;
  } value;  /* named member, so fields are accessed as v.value.i etc. */
};

You just had to be very careful to set the type value correctly whenever you changed the union's value and to retrieve only the value specified by the type.

Generic pointers. On mainstream platforms all object pointers have the same size and representation (the C standard does not guarantee this in general), so you can create a union of pointer types and set and retrieve values interchangeably without regard to type:

typedef union {
  void *v;
  int* i;
  char* c;
  float* f;
} ptr;

This is especially useful for (de)serializing binary data:

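// serialize
ptr p;             // the union itself, not a pointer to it
p.v = ...;         // set output buffer
*p.c++ = 'a';
*p.i++ = 12345;    // may be a misaligned write on some platforms
*p.f++ = 3.14159f;

// deserialize
ptr p;
p.v = ...;         // set input buffer
char c = *p.c++;
int i = *p.i++;    // same alignment caveat applies to reads
float f = *p.f++;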

FYI: You can make your example simpler. The structs are unnecessary. You'll get the same behavior with this:

int main() {

  union {
    int a;
    char b;
  } v1;

  v1.a = 5;
  v1.b = 'a';
}
Ferruccio answered Feb 27 '23 22:02


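The behavior you describe depends on the platform and compiler. x86 processors, for instance, are little-endian, so with gcc the byte holding 5 is the first byte of the int (the lowest address), which is the same byte the char occupies. Assigning 'a' therefore overwrites the 5 rather than leaving it untouched in the last byte.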

Unions are useful for two main reasons:

  • to share the same region of memory between several members in order to minimize the memory required (in this case, the first byte, for instance, may indicate the type of the data stored in the structure/union).
  • to inspect a data structure without casts and pointers. For instance, a union of a double and a char[8] is, on some platforms, an easy way to get a per-byte view of the double.

If there is no benefit in using a union, don't do it.

Déjà vu answered Feb 28 '23 00:02