I understand that in C, if a variable is explicitly declared with the register keyword, then one cannot use the & operator on it. It makes sense to me that there's no such thing as the "address" of a variable that's always kept in a register.
My question is, if the compiler decides on its own to store a variable in a register rather than spilling it, then what happens with the & operator during code execution?
I can think of two ways that the compiler might try to handle this:
1) Somehow emulate the & behavior, but this seems hairy and I have no idea how one would do this rigorously and efficiently.
2) Never keep the variable in a register if the & operator is used with this variable.
Does C take one of these approaches or does it do something else in this case?
Variables declared register may be stored in CPU registers, which the CPU can access more quickly than memory. Static variables are stored in the data segment of memory.
If you use the & operator with a register variable, the compiler reports an error or a warning (depending on the compiler you are using), because a variable declared register may be stored in a register instead of memory, and a register has no address to take.
A pointer is a variable that stores a memory address. Pointers are used to store the addresses of other variables or memory items.
No, you can't take the address of a variable declared register. It tells the compiler to try to use a CPU register, instead of RAM, to store the variable. Registers are in the CPU and much faster to access than RAM. But it's only a suggestion to the compiler, and it may not follow through.
The compiler maps every variable name to a unique memory address and incorporates the address into the machine code. Together, the compiler and the operating system determine the location in memory of each variable. So, a variable is a named location in main memory that has three characteristics: a name, a content, and a memory address.
It's the compiler's choice whether to put a variable in a register or not. Generally, compilers perform their own optimizations and put variables in registers.
The compiler does its best to ensure that the most frequently accessed variables are stored in registers, whenever possible.
Whether or not a variable is implemented in a hardware register has to be completely transparent to the user. It is up to the compiler to arrange this so that the current value can always be accessed through a pointer.
That changes when the user decides to declare the variable with the keyword register. Then the use of the & operator is simply prohibited.
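A minimal sketch of both points, using a hypothetical function name (bump_through_pointer is not from the post). Wherever the compiler decides to keep counter, the write through the pointer has to be visible when counter is read again. The commented-out lines show the register case, which compilers reject or diagnose.

```c
/* Hypothetical illustration; the names are not from the original post. */
int bump_through_pointer(void)
{
    int counter = 41;      /* plain automatic variable; the compiler picks its home */
    int *p = &counter;     /* taking the address is allowed */
    *p += 1;               /* write through the pointer */
    return counter;        /* must observe the write: returns 42 */
}

/*
 * With the register storage class, the address-of operator is not allowed:
 *
 *     register int r = 41;
 *     int *q = &r;        // constraint violation; the compiler must diagnose it
 */
```

Calling bump_through_pointer() yields 42 regardless of optimization level, which is the transparency described above.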
I'm not a compiler expert, but my impression is it's sort of the other way around. First the compiler tries to optimize the code. If after optimization, the address of the variable is not needed, then it is a candidate to be put in a register (or perhaps optimized completely out of existence).
Of course if the & operator is never applied to the variable at all, then it is certainly a candidate. But even if &x does appear in the source code, the need for the address may go away after optimization.
As a trivial example, if we have
int x = 7;
foo(*&x);
the compiler can see that *&x is exactly equivalent to x, and so the code can be treated as if it were just foo(x). If the address of x is not taken anywhere else, then it no longer needs to have an address at all, and can go in a register.
Now you can imagine extending this sort of analysis to more complicated code.
int x = foo1(), y = foo2();
int *p;
p = cond ? &x : &y;
return *p;
Try it on godbolt
Conceptually, this can be successively rewritten as
return *(cond ? &x : &y);
return cond ? *&x : *&y;
return cond ? x : y;
and now x and y no longer need to have addresses, and p no longer needs to exist at all.
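To make the equivalence concrete, here is a self-contained sketch of the snippet above in its original and rewritten forms. The bodies of foo1 and foo2 and the names pick_via_pointer and pick_direct are assumptions made up for illustration. Whether a given compiler actually emits identical code for the two depends on the compiler and the optimization level.

```c
/* Hypothetical stand-ins for foo1() and foo2() from the snippet above. */
static int foo1(void) { return 10; }
static int foo2(void) { return 20; }

/* Original form: the addresses of x and y appear in the source. */
int pick_via_pointer(int cond)
{
    int x = foo1(), y = foo2();
    int *p;
    p = cond ? &x : &y;
    return *p;
}

/* Form after the conceptual rewriting: no addresses, and no p. */
int pick_direct(int cond)
{
    int x = foo1(), y = foo2();
    return cond ? x : y;
}
```

Both functions return the same value for every cond, so an optimizer is free to treat the first as if it were the second.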
So in other words, the compiler doesn't try to "emulate" the & operator; rather, it tries to restructure the code so that it simply isn't needed.
The most common situation where this is not possible is if the address of the variable is passed to another function.
int x;
foo(&x);
Unless foo can be inlined or some other sort of interprocedural analysis is available, the compiler really does have to pass the address of something to foo, and so x has to exist in memory, at least for that moment. Of course, the compiler can choose to move it into a register immediately thereafter, and keep it there for the rest of the function if its address is not needed again; the question of whether a variable lives in memory or in a register need not be fixed for all time.
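A small sketch of that situation. The body of foo and the name use_after_call are assumptions for illustration. Here foo is defined in the same file so the example is complete, but imagine it living in another translation unit where the compiler cannot see into it.

```c
/* Hypothetical callee; assume the compiler cannot inline it. */
void foo(int *out)
{
    *out = 5;
}

int use_after_call(void)
{
    int x;
    foo(&x);               /* x needs a real address for the duration of this call */

    int sum = 0;
    for (int i = 0; i < 3; i++)
        sum += x;          /* address no longer needed; x may live in a register here */
    return sum;            /* 5 + 5 + 5 */
}
```

After the call returns, nothing else can legitimately change x behind the compiler's back, so it can load the value once and keep it in a register for the loop.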
First of all, please note that what happens to non-register variables during optimization is beyond the scope of the C language. And there is yet another option: remove the variable entirely from the machine code.
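As a sketch of that last option (the function name folded_away is hypothetical): in the function below, an optimizing compiler can typically work out the result at compile time, so neither x nor p needs to exist in the generated code. What actually gets emitted depends on the compiler and its settings.

```c
int folded_away(void)
{
    int x = 7;
    int *p = &x;           /* address taken in the source */
    return *p + 1;         /* an optimizer can reduce the whole body to "return 8" */
}
```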
My question is, if the compiler decides on its own to store a variable in register rather than spilling it, then what happens with the & operator during code execution?
The compiler is not likely to place a variable in a register if it spots the presence of the & operator. In fact, every real-world compiler I have ever used will place such a variable either on the stack or in static storage memory, thereby making it addressable.
Streaming compilers that generate assembly ASAP as they parse C code (e.g., tinycc) simply won't put variables in registers unless those variables have the register storage class specifier. A nonstreaming compiler that builds an abstract syntax tree, on the other hand, can decide whether to put something in a register only after it's seen the whole block. Then it can know for sure that the variable's address won't ever be needed (or that all accesses to the variable through its address can be optimized into direct register accesses).
Since C cannot interpret a dynamically inputted piece of C code at runtime within the context of its statically provided code (no _Eval(read_string()) where the user could input printf("%p\n",(void*)&some_local);), there cannot be any surprise address-taking at runtime. After a C compiler is finished with a block, it knows how every local in it will ever be used.