How do pointers work "under the hood" in C?

Take a simple program like this:

int main(void)
{
    char p;
    char *q;

    q = &p;

    return 0;
}

How is &p determined? Does the compiler calculate all such references beforehand, or is it done at runtime? If at runtime, is there some table of variables where it looks these things up? Does the OS keep track of them, so that the program just asks the OS?

My question may not even make sense in the context of the correct explanation, so feel free to set me straight.

Chris Middleton asked Mar 14 '14 17:03

1 Answer

How is &p determined? Does the compiler calculate all such references beforehand, or is it done at runtime?

This is an implementation detail of the compiler. Different compilers can choose different techniques depending on the kind of operating system they are generating code for and the whims of the compiler writer.

Let me describe for you how this is typically done on a modern operating system like Windows.

When the process starts up, the operating system gives the process a virtual address space of, let's say, 2 GB. Of that 2 GB, a 1 MB section is set aside as "the stack" for the main thread. The stack is a region of memory where everything "below" the current stack pointer is "in use", and everything in that 1 MB section "above" it is "free". How the operating system chooses which 1 MB chunk of virtual address space is the stack is an implementation detail of Windows.

(Aside: whether the free space is at the "top" or "bottom" of the stack, whether the "valid" space grows "up" or "down" is also an implementation detail. Different operating systems on different chips do it differently. Let's suppose the stack grows from high addresses to low addresses.)

The operating system ensures that when main is invoked, the register ESP (the stack pointer on 32-bit x86) contains the address of the dividing line between the valid and free portions of the stack.

(Aside: again, whether ESP holds the address of the first valid byte or the first free byte is an implementation detail.)

The compiler generates code for main that moves the stack pointer by, let's say, five bytes, subtracting from it if the stack is growing "down". It moves by five because main needs one byte for p and four for q (assuming four-byte pointers). So the stack pointer changes; there are now five more "valid" bytes and five fewer "free" bytes.

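Let's say that q is the memory now at addresses ESP through ESP+3, and p is the memory now at address ESP+4. To assign the address of p to q, the compiler generates code that computes the four-byte value ESP+4 and stores it in the locations ESP through ESP+3. No table of variables and no call to the operating system is involved; the address is just the current stack pointer plus an offset the compiler already knows.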

(Aside: Note that it is highly likely that the compiler lays out the stack so that everything that has its address taken is on an ESP+offset value that is divisible by four. Some chips have requirements that addresses be divisible by pointer size. Again, this is an implementation detail.)

If you do not understand the difference between an address used as a value and an address used as a storage location, figure that out. Without understanding that key difference you will not be successful in C.

That's one way it could work, but as I said, different compilers can choose to do it differently as they see fit.

Eric Lippert answered Oct 05 '22 05:10
