Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does the stack handle popping values off in a different order than they are created?

Tags:

stack

rust

I can't think of how to title this effectively, which might be why I cannot find a good search for this. In understanding the heap vs. stack during the Rust programming book I read:

The types covered in the “Data Types” section are all stored on the stack and popped off the stack when their scope is over.

I read the more in-depth distinction, but I'm still curious how a stack is used when variables can be defined and used hundreds of lines later. Take for example:

let x = 5;
let y = 6;
println!("x = {}", x); // cannot just pop x off here as y was pushed last?

To clarify, is a stack being used for scope, but inside each scope there also has to be a heap to know where to look during runtime?

Does this mean that the stack is allocated on the heap as well? Or does the compiler keep these two completely separated?

Sorry if this is morphing into a question about compilers and memory management in general.

fn main() {                  // if this is all a stack
    let one = 1;             // stack.push(one)
    let two = 2;             // stack.push(two) 

    let three = one + two;   // stack.push(stack.pop() + stack.pop())???
}

Does this make sense? I'm coming from Python so bear with me.

like image 533
Tony Avatar asked Jan 29 '23 10:01

Tony


1 Answers

That quote is playing a little fast and loose with the truth because it's only intended as a first estimation. A different quote from the same page is more accurate (emphasis mine):

When our code calls a function, the values passed into the function (including, potentially, pointers to data on the heap) and the function’s local variables get pushed onto the stack. When the function is over, those values get popped off the stack.

The "stack" (data structure) you may be familiar with is related to but quite different from the "stack" (function-calling) that is referred to here.

In a stack (data structure) you only access the value at the top of the stack. With the stack (function-calling), you reserve/return space from the top but you can access any memory in the stack. This could be any of the variables in the same function or even the variables from the calling function. The hardware doesn't care about these details; it is up to the language to ensure that items in the stack are only accessed when it is valid to do so.

It's important to recognize that the values of the variables are not being pushed onto the stack; the stack only corresponds to memory to store the variables.

The modification to the size of the stack only happens at function entry and exit. It's also done as one giant block; the stack pointer is incremented and decremented by the total size of all the function's local variables, regardless of their scope. Individual variables are never popped onto or off of the stack.

Rust's scoping rules prevent you from using the values that have been moved, but in actuality they are still sitting there on the stack. You can access them using unsafe code (but you shouldn't):

struct NonCopyNumber(i32);

fn example() {
    // We allocate space on the stack for all local variable
    // when we enter the function. There's 4 in this example.

    let value1 = NonCopyNumber(1); 

    let raw_ref1 = &value1.0 as *const i32;
    let raw_ref2;

    {
        let value2 = NonCopyNumber(2); 
        raw_ref2 = &value2.0 as *const i32;
    }

    drop(value1);

    // println!("{}", value1.0); // use of moved value: `value1.0`
    // println!("{}", value2.0); // Can't find value

    // Not really safe; assumes details about stack management.
    unsafe {
        println!("{}", *raw_ref1); // 1
        println!("{}", *raw_ref2); // 2
    }

    // Stack is popped
}

do inner set of braces create a separate stack frame

No, stack frames are only created at function entry

stack frame (aka scope)

Stack frames are not the same as scopes. Entering a function creates a new scope and introduces a stack frame. A pair of curly braces creates a new scope but does not introduce a stack frame .

does value2 gets popped upon reaching the closing inner brace?

No, stack frames are only popped at function exit.


Let's use this concrete example:

fn main() {
    let one: i32 = 1;
    let two: i32 = 2; 

    let three: i32 = one + two;
}

I'll annotate the stack with the stack pointer variable %sp. The stack pointer represents the current top of the stack.

fn main() {
    // This function has 3 `i32` variables. 
    // Each is 4 bytes so the function requires 12 
    // bytes of stack space total.

    let one;   // Stored in memory at %sp + 0 bytes
    let two;   // Stored in memory at %sp + 4 bytes
    let three; // Stored in memory at %sp + 8 bytes

    %sp += 12; // Increment / push the stack

    one = 1;
    two = 2; 

    three = one + two;

    // Done with all our variables.

    %sp -= 12; // decrement / pop the stack 
}

Space for all variables is reserved at the beginning of the function, then all the space is returned to the stack at the end of the function, all at once in both cases.

It may be worth noting that we could have added lots of extra braces and still arrived at the same stack-annotated result:

fn main() {
    let one: i32 = 1;
    {
        let two: i32 = 2;
        {
            let three: i32 = one + two;
        }
    }
}

It is only the semantics of the language that prevent us from using a variable after it goes out of scope or before it has been initialized.


Does this mean that the stack is allocated on the heap as well?

This is a very difficult question to answer succinctly. In the computer 1, you only have one chunk of memory: those RAM chips plugged into your motherboard. All of your program's data is ultimately stored here, whether it's "the stack" or "the heap". The big difference between the two is their access patterns.

As shown above, the stack is very lightweight and performant but inflexible. You can reserve more space in the stack (calling a function) and return that space to the stack (leave the function) and that's about it.

The heap is more flexible, but is slower, more complicated and requires additional bookkeeping.

Generally, the stack and heap grow toward each other in memory. One starts at zero and grows upward, the other starts at MAX and grows downward.

Then you get into details like threads. These carve off a piece of memory, perhaps from the heap, and then use that memory for their own stack. So, technically, yes: sometimes a stack is inside a heap.

Making it more complicated, there can actually be multiple heaps, each managing memory according to their own set of rules.

1 Not all computers, but the vast majority that people program for day-to-day. There's also different levels of memory for caches and whatnot, but those don't come into play here.


println!("x = {}", x); // cannot just pop x off here as y was pushed last?

Using a variable in println doesn't move it, so even if your original premise were true, it wouldn't come into play here. See Does println! borrow or own the variable?

like image 174
Shepmaster Avatar answered Feb 03 '23 14:02

Shepmaster