
Structure within union in C

When a variable is associated with a union, the compiler allocates memory based on the largest member, so the size of a union equals the size of its largest member. That means altering the value of one member should alter the values of the other members. But when I execute the following code:

#include <stdio.h>

union job
{
    int a;
    struct data
    {
        double b;
        int x;
    } q;
} w;

int main(void)
{
    w.q.b = 7;
    w.a = 4;
    w.q.x = 5;

    printf("%d %d %f", w.a, w.q.x, w.q.b);
    return 0;
}

output: 4 5 7.000000

The issue is this: I first assign a value to a and later modify q.x, so I expect the value of a to be overridden by the write to q.x. But the output still shows the original value of a as well as that of q.x. I am not able to understand why this is happening.

asked Jan 11 '23 by Divyadeep Bhalla

1 Answer

Your understanding is correct - the numbers should change. I took your code and added a little more to show you exactly what is going on.

The real issue is quite interesting, and has to do with the way floating point numbers are represented in memory.

First, let's create a map of the bytes used in your struct:

aaaa
bbbbbbbbxxxx

As you can see, the first four bytes of b overlap with a. This will turn out to be important.
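This is not in the original answer, but you can confirm that layout mechanically with the standard offsetof macro - a minimal sketch:

#include <stdio.h>
#include <stddef.h>

union job {
    int a;
    struct data {
        double b;
        int x;
    } q;
};

int main(void) {
    /* a and q.b both start at offset 0, so they overlap;
       q.x starts after the 8 bytes of q.b */
    printf("a   at offset %zu\n", offsetof(union job, a));    /* 0 */
    printf("q.b at offset %zu\n", offsetof(union job, q.b));  /* 0 */
    printf("q.x at offset %zu\n", offsetof(union job, q.x));  /* 8 */
    return 0;
}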

Now we have to take a look at the way double is typically stored (I am writing this from the perspective of a Mac, with 64 bit Intel architecture. It so happens that the format in memory is indeed the IEEE754 format):

[Figure: IEEE 754 double-precision layout - 1 sign bit, 11 exponent bits, 52 fraction bits]

The important thing to note here is that Intel machines are "little endian" - that is, the byte stored first is the "thing on the right", i.e. the least significant bits of the "fraction".
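If you want to check the byte order on your own machine, a quick sketch of mine is to look at the bytes of a known integer:

#include <stdio.h>

int main(void) {
    unsigned int v = 0x01020304;
    unsigned char *p = (unsigned char *)&v;
    /* little endian: least significant byte first -> prints 04 03 02 01 */
    printf("%02x %02x %02x %02x\n", p[0], p[1], p[2], p[3]);
    return 0;
}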

Now let's look at a program that does the same thing that your code did - but prints out the contents of the structure so we see what is happening:

#include <stdio.h>

/* print the raw bytes of an object, one hex pair per byte */
void dumpBytes(void *p, int n) {
  int ii;
  for (ii = 0; ii < n; ii++) {
    printf("%02x ", *((unsigned char *)p + ii));
  }
  printf("\n");
}

int main(void) {
  /* static => zero-initialized, so the first dump is all zeros */
  static union job
  {
      int a;
      struct data
      {
          double b;
          int x;
      } q;
  } w;

  printf("initial value:\n");
  dumpBytes(&w, sizeof(w));
  w.q.b = 7;
  printf("setting w.q.b = 7:\n");
  dumpBytes(&w, sizeof(w));
  w.a = 4;
  printf("setting w.a = 4:\n");
  dumpBytes(&w, sizeof(w));
  w.q.x = 5;
  printf("setting w.q.x = 5:\n");
  dumpBytes(&w, sizeof(w));

  printf("values are now %d %d %.15lf\n", w.a, w.q.x, w.q.b);
  w.q.b = 7;
  printf("setting w.q.b = 7:\n");
  dumpBytes(&w, sizeof(w));
  printf("values are now %d %d %.15lf\n", w.a, w.q.x, w.q.b);
  return 0;
}

And the output:

initial value:
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

All zeros (I declared the variable static - that makes sure everything is zero-initialized). Note that the function prints out 16 bytes, even though you might have expected a struct whose members are a double plus an int to be only 12 bytes long. This is due to alignment: because the largest member is 8 bytes long, the structure is padded to a multiple of 8, i.e. aligned on 8-byte boundaries.
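As an aside, you can verify that padding yourself; here is a small sketch of mine (not part of the walkthrough) that prints the sizes involved:

#include <stdio.h>

union job {
    int a;
    struct data {
        double b;
        int x;
    } q;
};

int main(void) {
    /* double requires 8-byte alignment, so the 12 bytes of members
       in struct data are padded out to 16 */
    printf("sizeof(struct data) = %zu\n", sizeof(struct data)); /* 16 */
    printf("sizeof(union job)   = %zu\n", sizeof(union job));   /* 16 */
    return 0;
}

Back to the dumps: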

setting w.q.b = 7:
00 00 00 00 00 00 1c 40 00 00 00 00 00 00 00 00 

Let's look at the bytes representing the double in their correct order:

40 1c 00 00 00 00 00 00

Sign bit = 0
exponent = 100 0000 0001b (1025; subtract the bias of 1023 to get 2)
fraction = 1100 0000 ... 0000b (the 52-bit fraction field, 0xc000000000000 in hex)

With the implicit leading 1, the mantissa is 1.11b = 1.75, and the value is 1.75 x 2^2 = 7.
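You can extract those fields programmatically instead of decoding the hex dump by hand - a sketch of mine, assuming 64-bit IEEE 754 doubles:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void) {
    double d = 7.0;
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);  /* reinterpret the 8 bytes */

    printf("sign     = %u\n", (unsigned)(bits >> 63));            /* 0 */
    printf("exponent = %u\n", (unsigned)((bits >> 52) & 0x7FF));  /* 1025 */
    printf("fraction = 0x%013llx\n",
           (unsigned long long)(bits & 0xFFFFFFFFFFFFFULL));      /* 0xc000000000000 */
    return 0;
}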

setting w.a = 4:
04 00 00 00 00 00 1c 40 00 00 00 00 00 00 00 00 

When we now write a, we modify the first byte in memory. That byte holds the least significant bits of the fraction, which changes from 0xc000000000000 to:

0xc000000000004

The format implies a 1 to the left of the fraction, so changing those last bits from 0 to 4 changed the magnitude of the number by just a tiny amount - you need to look at the 15th decimal place to see it.
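The size of that change checks out: the low byte 04 adds 4 x 2^-52 to the fraction, and the exponent scales it by 2^2, giving 2^-48, roughly 3.55 x 10^-15. A quick check of mine:

#include <stdio.h>
#include <math.h>

int main(void) {
    /* 4 ulps in the 52-bit fraction, scaled by the exponent 2^2:
       4 * 2^-52 * 2^2 = 2^-48 */
    printf("%.15f\n", 7.0 + ldexp(4.0, -50));  /* prints 7.000000000000004 */
    return 0;
}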

setting w.q.x = 5:
04 00 00 00 00 00 1c 40 05 00 00 00 00 00 00 00 

The value 5 is written in its own little space, after the 8 bytes of b.

values are now 4 5 7.000000000000004

Note - when printing with a large number of digits, you can see that b is no longer exactly 7 - even though a double is perfectly capable of representing the integer 7 exactly.

setting w.q.b = 7:
00 00 00 00 00 00 1c 40 05 00 00 00 00 00 00 00 
values are now 0 5 7.000000000000000

After writing 7 into the double again, you can see that the first byte is once again 00 - which is also why a now reads 0 - and the result of the printf statement is indeed exactly 7.0.

So - your understanding was correct. The problem was in your diagnosis - the number was different but you couldn't see it.

Usually a good way to look for these things is to store the number in a temporary variable and look at the difference. You would have found it easily enough that way.
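For example, a sketch of that approach (my addition, not part of the original answer):

#include <stdio.h>

union job {
    int a;
    struct data { double b; int x; } q;
} w;

int main(void) {
    w.q.b = 7;
    double before = w.q.b;  /* save the value before it gets clobbered */
    w.a = 4;
    /* the difference exposes a change that plain %f hides */
    printf("difference: %g\n", w.q.b - before);  /* ~3.55271e-15 */
    return 0;
}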

answered Jan 21 '23 by Floris