
Why doesn't the compiler know the addresses of local variables at compile-time?

What does the following statement mean?

Local and dynamically allocated variables have addresses that are not known by the compiler when the source file is compiled

I used to think that local variables are assigned addresses at compile time, and that such an address could change when the variable goes out of scope and comes back into scope across function calls. But the statement above says the addresses of local variables are not known by the compiler. So how are local variables allocated? And why can the addresses of global variables be known at compile time?

Also, can you please provide a good link to read about how local and other variables are allocated?

Thanks in advance!

T.J. asked Feb 08 '12 02:02


2 Answers

The above quote is correct - the compiler typically doesn't know the address of local variables at compile-time. That said, the compiler probably knows the offset from the base of the stack frame at which a local variable will be located, but depending on the depth of the call stack, that might translate into a different address at runtime. As an example, consider this recursive code (which, by the way, is not by any means good code!):

int Factorial(int num) {
    int result;
    if (num == 0)
        result = 1;
    else
        result = num * Factorial(num - 1);

    return result;
}

Depending on the parameter num, this code might end up making several recursive calls, so there will be several copies of result in memory, each holding a different value. Consequently, the compiler can't know where they all will go. However, each instance of result will probably be offset the same amount from the base of the stack frame containing each Factorial invocation, though in theory the compiler might do other things like optimizing this code so that there is only one copy of result.
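
To see this concretely, here is a minimal sketch that instruments the function above. The name FactorialTraced and the addresses parameter are hypothetical additions for illustration; they are not part of the original code. Each invocation records where its own copy of result lives.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical variant of Factorial that records the address of `result`
// in every invocation, outermost call first.
int FactorialTraced(int num, std::vector<std::uintptr_t>& addresses) {
    int result;
    addresses.push_back(reinterpret_cast<std::uintptr_t>(&result));
    if (num == 0)
        result = 1;
    else
        result = num * FactorialTraced(num - 1, addresses);

    return result;
}
```

Calling FactorialTraced(4, addresses) fills the vector with five addresses, one per invocation, and no two of them are equal because all five copies of result are alive at the same time. On many platforms the addresses also happen to be evenly spaced, but that is up to the compiler and is not guaranteed.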

Typically, compilers allocate local variables by maintaining a model of the stack frame and tracking where the next free location in the stack frame is. That way, each local variable is assigned an offset relative to the start of the stack frame. When the function is called, that offset is combined with the base address of the current frame (usually held in a stack or frame pointer register) to find the location of the variable in that particular stack frame.
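
The bookkeeping described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the names FrameLayout, allocate, offset_of, frame_size and address_of are all made up here, offsets grow upward from the frame base (real stacks often grow downward), and align is assumed to be a nonzero alignment in bytes. It is not how any particular compiler is implemented.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Toy model of the per-function frame layout a compiler might keep.
class FrameLayout {
public:
    // Reserve `size` bytes at the next free location, rounded up to `align`.
    // Returns the variable's offset from the base of the frame.
    std::size_t allocate(const std::string& name, std::size_t size, std::size_t align) {
        next_free_ = (next_free_ + align - 1) / align * align;
        const std::size_t offset = next_free_;
        offsets_[name] = offset;
        next_free_ += size;
        return offset;
    }

    // Offset decided at "compile time" for a previously allocated variable.
    std::size_t offset_of(const std::string& name) const { return offsets_.at(name); }

    // Total number of bytes the frame needs so far.
    std::size_t frame_size() const { return next_free_; }

private:
    std::size_t next_free_ = 0;
    std::map<std::string, std::size_t> offsets_;
};

// At "run time", the actual address is the frame base plus the fixed offset.
std::size_t address_of(std::size_t frame_base, const FrameLayout& layout,
                       const std::string& name) {
    return frame_base + layout.offset_of(name);
}
```

In this model the offset of a variable is fixed once, while address_of gives a different answer for every frame base, which mirrors why the compiler knows the offset but not the final address.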

Global variables, on the other hand, can have their addresses known at compile-time. They differ from locals primarily in that there is always exactly one copy of a global variable in a program, whereas a local variable might exist zero or more times depending on how execution goes. Because there is one unique copy of the global, the compiler and linker can give it a single fixed location and hardcode a reference to it. (On systems that load programs at randomized addresses, that location is fixed relative to the program image rather than absolute.)
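
As a small contrast with the recursive example, the sketch below uses a hypothetical global named global_counter and a hypothetical helper GlobalAddressAtDepth. No matter how deep the call stack is when the address is taken, it is the address of the same single object.

```cpp
// Hypothetical global: exactly one copy exists for the whole program.
int global_counter = 0;

// Descends through `depth` nested calls, then returns the address of the
// global as observed from the innermost call.
int* GlobalAddressAtDepth(int depth) {
    if (depth == 0)
        return &global_counter;
    return GlobalAddressAtDepth(depth - 1);
}
```

Within one run of the program, GlobalAddressAtDepth(0) and GlobalAddressAtDepth(5) return the same pointer, whereas the address of a local would differ from frame to frame.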

As for further reading, if you'd like a fairly in-depth treatment of how a compiler can lay out variables, you may want to pick up a copy of Compilers: Principles, Techniques, and Tools, Second Edition by Aho, Lam, Sethi, and Ullman. Although much of this book concerns other compiler construction techniques, a large section of the book is dedicated to implementing code generation and the optimizations that can be used to improve generated code.

Hope this helps!

templatetypedef answered Nov 15 '22 17:11


In my opinion the statement is not talking about runtime access to variables or scoping, but is trying to say something subtler.

The key here is the combination of "local and dynamically allocated" and "compile time". I believe the statement is saying that those addresses cannot be used as compile-time constants. This is in contrast to the addresses of statically allocated variables, which can be used as compile-time constants. One example of this is in templates:

template<int *>
class Klass
{
};

int x;

//OK as it uses address of a static variable;
Klass<&::x> x_klass;


int main()
{
   int y;
   Klass<&y> y_klass; //NOT OK since y is local.
}

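Templates impose an additional constraint. Before C++17, a pointer used as a template argument had to refer to an object with linkage, and a function-local static has no linkage, so this does not compile under those older standards: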

int main()
{
    static int y;
    Klass<&y> y_klass;
}

However, other contexts that require compile-time constants may be able to use &y.
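
One such context, as a hedged sketch assuming a C++11 or later compiler, is the initializer of a constexpr pointer. The function name AddressOfLocalStatic is made up for this illustration.

```cpp
// Returns the address of a function-local static variable.
int* AddressOfLocalStatic() {
    static int y = 0;

    // y has static storage duration, so &y is a constant expression and may
    // initialize a constexpr pointer even though y has no linkage.
    constexpr int* p = &y;

    // For contrast, left commented out because it would not compile:
    //   int z = 0;
    //   constexpr int* q = &z;  // z has automatic storage duration

    return p;
}
```

Every call returns the same pointer, since there is only one y for the whole program.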

And similarly I'd expect this to be invalid:

static int * p;

int main()
{
   p = new int();
   Klass<p> p_klass;
}

This fails because the value of p is only determined at run time: it points to dynamically allocated storage, even though p itself is static.

Michael Anderson answered Nov 15 '22 16:11