
Where and how are constants stored?

I read this question here and a related question in the c-faq, but I still don't understand the exact reason behind this:

    #include <iostream>

    using namespace std;

    int main()
    {
        //const int *p1 = (int*) &(5);  //error C2101: '&' on constant
        //cout << *p1;

        const int five = 5;
        const int *p2 = &(five);
        cout << *p2 << endl;

        char *chPtr = (char*) &("abcde");
        for (int i=0; i<4; i++) cout << *(chPtr+i);
        cout << endl;
        return 0;
    }

I was wondering how constants, either integer or string literals, get stored. My understanding of string literals is that they are created in global static memory when the program starts and persist until it exits. In the case of "abcde", even though I did not give it a variable name, I can take its address (chPtr), and I assume I could dereference chPtr any time before program termination and the character values would still be there, even outside the scope where the literal appeared. Is the const int variable "five" also placed in global static memory, so that the address in p2 can also be dereferenced at any time?

Why can I take the address of "five" but not ask for &(5)? Are the constants "5" and "five" stored differently, and where is "5" stored in memory?

Vikas Verma asked Dec 20 '22 18:12


2 Answers

You cannot take the address of a literal (e.g. &(5)) because the literal is not "stored" anywhere: it is typically encoded directly in the machine instruction. (In language terms, 5 is an rvalue, and & requires an lvalue.) Depending on the platform, you'll get different instructions, but a MIPS64 addition example would look like this:

DADDUI R1, R1, #5

Trying to take the address of the immediate is meaningless, as it doesn't reside in (data) memory but is part of the instruction itself.

If you declare const int i = 5 and never need its address, the compiler can (and likely will) treat it as a literal and place 5 directly in the appropriate assembly instructions. Once you take the address of i, the compiler sees that it can no longer do that and places i in memory. The same does not happen when you try to take the address of a literal, because you haven't told the compiler to allocate space for a variable. (When you declare a const int i, it allocates the space in the first pass and may later determine that the space is not needed; it does not work in reverse.)

String constants are stored in the static portion of the data memory, which is why you can take their address. They have static storage duration, so such a pointer stays valid until the program exits.

Zac Howland answered Dec 31 '22 03:12


"It depends" is probably not a satisfying answer, but it is the correct one. The compiler will store some const variables on the stack if it needs to (for example, if you ever take the variable's address). However, compilers have always had the idea of a "constexpr" variable, even if we didn't always have a mechanism to ask for it directly: if an expression can be calculated at compile time, then instead of calculating it at run time, the compiler can calculate it during compilation. And if it can be calculated at compile time, and we never do anything that requires it to be anything else, then the variable can be removed altogether and turned into a literal, which becomes part of the instruction!

Take for example, the following code:

#include <iostream>

int main(int argc, char** argv)
{
  const int a = 2;
  const int b = 3;
  const int c = a+b;

  volatile int d = 6;
  volatile int e = c+d;

  std::cout << e << std::endl;
  return 0;
}

Look at how smart the compiler is:

    37    const int a = 2;
    38    const int b = 3;
    39    const int c = a+b;
    40
    41    volatile int d = 6;
0x400949  <+0x0009>         movl   $0x6,0x8(%rsp)
    42    volatile int e = c+d;
0x400951  <+0x0011>         mov    0x8(%rsp),%eax
0x400955  <+0x0015>         add    $0x5,%eax
0x400958  <+0x0018>         mov    %eax,0xc(%rsp)
    43
    44    std::cout << e << std::endl;
0x400944  <+0x0004>         mov    $0x601060,%edi
0x40095c  <+0x001c>         mov    0xc(%rsp),%esi
0x400960  <+0x0020>         callq  0x4008d0 <_ZNSolsEi@plt>
    45    return 0;
    46  }

(volatile tells the compiler not to do fancy memory tricks with that variable.) On line 42, where I use c, the add is done with the LITERAL 0x5, even though c is computed from other variables. Lines 37-39 produce NO instructions.

Now let's change the code so that I need the address of a:

#include <iostream>

int main(int argc, char** argv)
{
  const int a = 2;
  const int b = 3;
  const int c = a+b;

  volatile int d = 6;
  volatile int e = c+d;
  volatile int* f = (int*)&a;
  volatile int g = *f;

  std::cout << e << std::endl;
  std::cout << g << std::endl;
  return 0;
}


    37        const int a = 2;
0x400955  <+0x0015>         movl   $0x2,(%rsp)
    38        const int b = 3;
    39        const int c = a+b;
    40
    41        volatile int d = 6;
0x400949  <+0x0009>         movl   $0x6,0x4(%rsp)
    42        volatile int e = c+d;
0x400951  <+0x0011>         mov    0x4(%rsp),%eax
0x40095c  <+0x001c>         add    $0x5,%eax
0x40095f  <+0x001f>         mov    %eax,0x8(%rsp)
    43        volatile int* f = (int*)&a;
    44        volatile int g = *f;
0x400963  <+0x0023>         mov    (%rsp),%eax
0x400966  <+0x0026>         mov    %eax,0xc(%rsp)
    45
    46        std::cout << e << std::endl;
0x400944  <+0x0004>         mov    $0x601060,%edi
0x40096a  <+0x002a>         mov    0x8(%rsp),%esi
0x40096e  <+0x002e>         callq  0x4008d0 <_ZNSolsEi@plt>
    47        std::cout << g << std::endl;
0x40097b  <+0x003b>         mov    0xc(%rsp),%esi
0x40097f  <+0x003f>         mov    $0x601060,%edi
0x400984  <+0x0044>         callq  0x4008d0 <_ZNSolsEi@plt>
    48        return 0;

So we can see that a is initialized in actual memory, on the stack (I can tell because it is addressed relative to rsp). But wait... c depends on a, yet whenever I use c it is still a literal 5! What is happening here? Well, the compiler knows that a needs a memory location because of the way it is used. However, it also knows that the variable's value is never anything but 2, so wherever a is used in a way that doesn't need the memory, it can use the literal 2 instead. Which means the a used on line 39 is effectively not the same as the a whose address is taken on line 43.

So where are const variables stored? They are stored where they NEED to be stored. CRAZY.

(By the way, these were all compiled with g++ -g -O2. Different compilers and flags will optimize differently; this mostly demonstrates what the compiler can do. The only guarantee is that your code will behave correctly.)

IdeaHat answered Dec 31 '22 04:12