Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why are consecutive int data type variables located at 12 bytes offset in visual studio?

To clarify the question, please observe the c/c++ code fragment:

int a = 10, b = 20, c = 30, d = 40; //consecutive 4 int data values.

int* p = &d; //address of variable d.

Now, in visual studio (tested on 2013), if value of p == hex_value (which can be viewed in debugger memory window), then, you can observe that, the addresses for other variables a, b, c, and d are each at a 12 byte difference!

So, if p == hex_value, then it follows:

&c == hex_value + 0xC (note hex C is 12 in decimal)

&b == &c + 0xC

&a == &b + 0xC 

So, why is there a 12 bytes offset instead of 4 bytes -- int are just 4 bytes?

Now, if we declared an array:

int array[]  = {10,20,30,40};

The values 10, 20, 30, 40 each are located at 4 bytes difference as expected!

Can anyone please explain this behavior?

like image 831
user3330840 Avatar asked Aug 22 '14 06:08

user3330840


1 Answers

The standard C++ states in section 8.3.4 Arrays that "An object of array type contains a contiguously allocated non-empty set of N subobjects of type T."

This is why, array[] will be a set of contiguous int, and that difference between one element and the next will be exactly sizeof(int).

For local/block variables (automatic storage), no such guarantee is given. The only statements are in section 1.7. The C++ memory model: "Every byte has a unique address." and 1.8. The C++ object model: "the address of that object is the address of the first byte it occupies. Two objects (...) shall have distinct addresses".

So everything that you do assuming contiguousness of such objects would be undefined behavior and non portable. You cannot even be sure of the order of the addresses in which these objects are created.

Now I have played with a modified version of your code:

int a = 10, b = 20, c = 30, d = 40; //consecutive 4 int data values.
int* p = &d; //address of variable d.
int array[] = { 10, 20, 30, 40 };
char *pa = reinterpret_cast<char*>(&a), 
     *pb = reinterpret_cast<char*>(&b), 
     *pc = reinterpret_cast<char*>(&c), 
     *pd = reinterpret_cast<char*>(&d);
cout << "sizeof(int)=" << sizeof(int) << "\n &a=" << &a << \
  " +" << pa - pb << "char\n &b=" << &b << \
  " +" << pb - pc  << "char\n &c=" << &c << \
  " +" << pc - pd << "char\n &d=" << &d;
memset(&d, 0, (&a - &d)*sizeof(int));    
// ATTENTION:  undefined behaviour:  
// will trigger core dump on leaving 
// "Runtime check #2, stack arround the variable b was corrupted". 

When running this code I get:

debug                   release                comment on release

sizeof(int)=4           sizeof(int)=4       
 &a=0052F884 +12char     &a=009EF9AC +4char
 &b=0052F878 +12char     &b=009EF9A8 +-8char   // is before a 
 &c=0052F86C +12char     &c=009EF9B0 +12char   // is just after a !!
 &d=0052F860             &d=009EF9A4

So you see that the order of the addresses may even be altered on the same compiler, depending on the build options !! In fact, in release mode the variables are contiguous but not in the same order.

The extra spaces on the debug version come from option /RTCs. I have on purpose overwritten the variables with a harsh memset() that assumes they are contiguous. Upon exit of the execution, I get immediately a message: "Runtime check #2, stack arround the variable b was corrupted", which clearly demonstrate the purpose of these extra chars.
If you remove the option, you will get with MSVC13 contiguous variables, each of 4 bytes as you did expect. But there will be no more error message about corruption of stack either.

like image 112
Christophe Avatar answered Nov 08 '22 13:11

Christophe