I'm learning C and am currently learning about pointers. I understand the principle of storing the address of a byte in memory as a variable, which makes it possible to read that byte from memory and write to that memory address.
However, I don't understand where the address of a pointer is stored. Let's say the value of a pointer (the address of a byte in memory) is stored somewhere in memory - how can the program know where the pointer is stored? Wouldn't that need a pointer for a pointer resulting in endless pointers for pointers for pointers... ?
UPDATE
The actual question is: "How does the compiler assign memory addresses to variables?" I found this question, which covers the topic.
Thanks to everybody who's answered.
This is an implementation detail, but...
Not all addresses are stored in memory. The processor also has registers, which can be used to store addresses. There are only a handful of registers which can be used this way, maybe 16 or 32, compared to the billions of bytes you can store in memory.
Some variables will get stored in registers. If you need to quickly add up some numbers, for example, the compiler might use, e.g., %eax (which is a register on x86) to accumulate the result. If optimizations are enabled, it is quite common for variables to exist only in registers. Of course, only a few variables can be in registers at any given time, so most variables will need to get written to memory at some point.
If a variable is saved to memory because there aren't enough registers, it is called "spilling". Compilers work very hard to avoid register spilling.
int func()
{
    int x = 3;
    return x;
    // x will probably just be stored in %eax, instead of memory
}
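As a rough illustration of the "adding up some numbers" case, here is a small summing function. The name sum_values is made up for this sketch, and whether total really lives only in a register depends on the compiler, the target, and the optimization level; the C language itself makes no promise either way.

```c
#include <stddef.h>

/* Hypothetical helper: adds up `count` integers.
 * `total` and `i` never have their address taken, so an optimizing
 * compiler is free to keep both in registers for the whole loop. */
int sum_values(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

If you want to see what your own compiler does, compiling with an option such as -S and reading the generated assembly is usually the quickest way to check.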
Commonly, one register points to a special region called the "stack". So a pointer used by a function may be stored on the stack, and the address of that pointer can be calculated by doing pointer arithmetic on the stack pointer. The stack pointer doesn't have an address because it's a register, and registers don't have addresses.
void func()
{
    int x = 3; // address could be "stack pointer + 8" or something like that
}
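To connect this back to the question: a pointer that is a local variable is just another slot in the stack frame (or a register), so the compiler works out its address the same way it does for any other local. A minimal sketch, with made-up names, might look like this:

```c
/* Hypothetical demo: `p` is itself a variable, so it has an address
 * (`&p`) that the compiler derives from the frame layout. Nothing
 * has to store that address unless we ask for it, as `pp` does. */
int read_through_chain(void)
{
    int x = 3;      /* some slot in the stack frame (or a register) */
    int *p = &x;    /* another slot, holding the address of x */
    int **pp = &p;  /* exists only because we explicitly asked for it */

    **pp = 7;       /* follows pp to p, then p to x, and writes x */
    return x;       /* x is now 7 */
}
```

The chain stops wherever you stop declaring variables: there is no hidden pointer to pp, because its location is simply an offset the compiler already knows.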
The compiler chooses the layout of the stack, giving each function a "stack frame" large enough to hold all of that function's variables. If optimization is disabled, variables will usually each get their own slot in the stack frame. With optimization enabled, slots will be reused, shared, or optimized out altogether.
Another alternative is to store data at a fixed location, e.g., "address 100".
// global variable... could be stored at a fixed location, such as address 100
int x = 3;
int get_x()
{
    return x; // returns the contents of address 100
}
This is actually not uncommon. Remember that "address 100" doesn't necessarily correspond to physical RAM; it is a virtual address referring to part of your program's virtual address space. Virtual memory allows multiple programs to all use "address 100", and that address will correspond to a different chunk of physical memory in each running program.
Absolute addresses can also be used on systems without virtual memory, or for programs which don't use virtual memory: bootloaders, operating system kernels, and software for embedded systems may use fixed addresses without virtual memory.
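One observable consequence of the fixed-location approach can be sketched as follows. The names shared_counter, address_of_counter, and get_counter are invented for this example. The actual address may differ from one run of the program to the next, so the sketch only relies on it staying the same within a single run.

```c
/* Hypothetical global; its address is settled by the time the program
 * starts running and does not change for the lifetime of the process. */
int shared_counter = 3;

/* Returns the location the toolchain and loader picked for the global. */
const int *address_of_counter(void)
{
    return &shared_counter;
}

/* Reads the global directly; no pointer variable is involved here,
 * the location is built into the generated code. */
int get_counter(void)
{
    return shared_counter;
}
```

Calling address_of_counter() twice in the same run gives the same pointer both times, which is what you would expect if the location is baked in rather than looked up through another pointer.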
An absolute address is specified by the compiler by putting a "hole" in the machine code, called a relocation.
int get_x()
{
    return x; // returns the contents of address ???
    // Relocation: please put the address of "x" here
}
The linker then chooses the address for x, and places the address in the machine code for get_x().
Yet another alternative is to store data at a location relative to the code that's being executed.
// global variable... could be stored at address 100
int x = 3;
int get_x()
{
    // this instruction might appear at address 75
    return x; // returns the contents of this address + 25
}
Shared libraries almost always use this technique, which allows the shared library to be loaded at whatever address is available in a program's address space. Unlike programs, shared libraries can't pick their address, because another shared library might pick the same address. Programs can also use this technique; such a program is called a "position-independent executable". Programs may be built position-independent on systems which lack virtual memory, or to provide additional security on systems with virtual memory, since an unpredictable load address makes it harder to write shellcode.
Just like with absolute addresses, the compiler will put a "hole" in the machine code and ask the linker to fill it in.
int get_x()
{
    return x; // return the contents of here + ???
    // Relocation: put the relative address of x here
}
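The "base plus fixed offset" idea can be imitated in plain C, purely as an analogy. This is not what the compiler emits for position-independent code; the struct image type and both functions below are invented for illustration. The point is only that the offset belongs to the layout, so the same code works wherever the data ends up.

```c
#include <stddef.h>

/* Hypothetical stand-in for a loaded image: some "code" followed by data. */
struct image {
    char code[25];
    int x;
};

/* Distance from the start of the image to x. It is a property of the
 * layout, not of where a particular image happens to live in memory. */
size_t offset_of_x(void)
{
    return offsetof(struct image, x);
}

/* Finds x by adding the fixed offset to whatever base address is given. */
int read_x(const struct image *base)
{
    const char *start = (const char *)base;
    return *(const int *)(start + offset_of_x());
}
```

Two struct image objects at different addresses can both be passed to read_x, and each yields its own x, much as a shared library keeps working no matter where it is loaded.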