
Why does the Rust compiler not reuse the memory on the stack after an object is moved?

I thought that once an object is moved, the stack memory it occupied can be reused for other purposes. However, the minimal example below shows the opposite.

#[inline(never)]
fn consume_string(s: String) {
    drop(s);
}

fn main() {
    println!(
        "String occupies {} bytes on the stack.",
        std::mem::size_of::<String>()
    );

    let s = String::from("hello");
    println!("s at {:p}", &s);
    consume_string(s);

    let r = String::from("world");
    println!("r at {:p}", &r);
    consume_string(r);
}

After compiling the code with the --release flag, I get the following output on my computer.

String occupies 24 bytes on the stack.
s at 0x7ffee3b011b0
r at 0x7ffee3b011c8

It is pretty clear that even if s is moved, r does not reuse the 24-byte chunk on the stack that originally belonged to s. I suppose that reusing the stack memory of a moved object is safe, but why does the Rust compiler not do it? Am I missing any corner case?

Update: If I enclose s in curly brackets, r can reuse the 24-byte chunk on the stack.

#[inline(never)]
fn consume_string(s: String) {
    drop(s);
}

fn main() {
    println!(
        "String occupies {} bytes on the stack.",
        std::mem::size_of::<String>()
    );

    {
        let s = String::from("hello");
        println!("s at {:p}", &s);
        consume_string(s);
    }

    let r = String::from("world");
    println!("r at {:p}", &r);
    consume_string(r);
}

The code above gives the output below.

String occupies 24 bytes on the stack.
s at 0x7ffee2ca31f8
r at 0x7ffee2ca31f8

I thought that the curly brackets should not make any difference, because the lifetime of s ends after calling consume_string(s) and its drop handler is called within consume_string(). Why does adding the curly brackets enable the optimization?

The version of the Rust compiler I am using is given below.

rustc 1.54.0-nightly (5c0292654 2021-05-11)
binary: rustc
commit-hash: 5c029265465301fe9cb3960ce2a5da6c99b8dcf2
commit-date: 2021-05-11
host: x86_64-apple-darwin
release: 1.54.0-nightly
LLVM version: 12.0.1

Update 2: To clarify the focus of this question, I want to know which of the following categories the proposed "stack reuse optimization" falls into.

  1. This is an invalid optimization. In certain cases the compiled code may fail if we perform the "optimization".
  2. This is a valid optimization, but the compiler (including both the rustc front end and LLVM) is not capable of performing it.
  3. This is a valid optimization, but is temporarily turned off, like this.
  4. This is a valid optimization, but is missed. It will be added in the future.

Zhiyao asked May 12 '21 07:05




1 Answer

My TLDR conclusion: A missed optimization opportunity.

So the first thing I did was look into whether your consume_string function actually makes a difference. To do this I created the following (a bit more) minimal example:

struct Obj([u8; 8]);

fn main() {
    println!(
        "Obj occupies {} bytes on the stack.",
        std::mem::size_of::<Obj>()
    );

    let s = Obj([1, 2, 3, 4, 5, 6, 7, 8]);
    println!("{:p}", &s);
    std::mem::drop(s); // s is moved here and cannot be used afterwards

    let r = Obj([11, 12, 13, 14, 15, 16, 17, 18]);
    println!("{:p}", &r);
    std::mem::drop(r);
}

Instead of consume_string I use std::mem::drop which is dedicated to simply consuming an object. This code behaves just like yours:

Obj occupies 8 bytes on the stack.
0x7ffe81a43fa0
0x7ffe81a43fa8

Removing the drop doesn't affect the result.

So the question is then why rustc doesn't notice that s is dead before r goes live. As your second example shows, enclosing s in a scope will allow the optimization.

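Why does this work? Because Rust semantics dictate that a variable lives until the end of its scope, at which point it is dropped (unless it has already been moved out). Since s is declared in an inner scope, its stack slot is released when that scope exits, before r is created. Without the inner scope, the slot for s stays reserved until the main function exits.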
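
To make the two drop points concrete, here is a minimal sketch. Noisy, consume, scenario and drop_order are hypothetical names made up for this illustration, not part of the question's code. It only records when destructors run; it says nothing about how stack slots are assigned.

```rust
use std::cell::RefCell;

// Hypothetical helper type: appends a line to `log` when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Takes ownership, so the value is dropped before this function returns.
fn consume(n: Noisy<'_>) {
    drop(n);
}

fn scenario(log: &RefCell<Vec<String>>) {
    {
        let _a = Noisy { name: "a", log };
    } // `_a` goes out of scope and is dropped here
    log.borrow_mut().push("after inner scope".to_string());

    let b = Noisy { name: "b", log };
    consume(b); // `b` is moved and dropped inside `consume`
    log.borrow_mut().push("after consume".to_string());

    let _c = Noisy { name: "c", log };
    log.borrow_mut().push("end of scenario body".to_string());
} // `_c` is dropped here, at the end of the function's scope

// Runs the scenario and returns the recorded events in order.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    scenario(&log);
    log.into_inner()
}

fn main() {
    for line in drop_order() {
        println!("{}", line);
    }
}
```

The destructor of a moved value runs inside the callee, while an unmoved value is dropped where its scope ends. How long the stack slot stays reserved is a separate matter, decided by the compiler.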

Why doesn't it work when s is moved into a function, where it is dropped on exit? Probably because rustc doesn't mark the memory location used by s as free after the function call. As has been mentioned in the comments, it is LLVM that actually performs this optimization (called 'Stack Coloring' as far as I can tell), which means rustc must correctly tell it when the memory is no longer in use. Clearly, from your last example, rustc does so on scope exit, but apparently not when an object is moved.
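
If you want to check the behavior on your own toolchain, a small harness along the following lines may help. This is only a sketch: address, without_scope and with_scope are hypothetical names, and std::hint::black_box (stable since Rust 1.66) is used here on the assumption that it keeps the optimizer from removing the locals. Whether the two addresses match depends on the compiler version and optimization level, so no expected output is given.

```rust
use std::hint::black_box;

#[allow(dead_code)]
struct Obj([u8; 8]);

// Returns the address of the referenced object as an integer.
fn address(o: &Obj) -> usize {
    o as *const Obj as usize
}

// Both locals live directly in the function body's scope.
#[inline(never)]
fn without_scope() -> (usize, usize) {
    let s = Obj([1; 8]);
    let a = address(black_box(&s));
    drop(s); // moves `s`; its slot may or may not be reused below

    let r = Obj([2; 8]);
    let b = address(black_box(&r));
    drop(r);
    (a, b)
}

// The first local is confined to an inner scope that ends before `r` is created.
#[inline(never)]
fn with_scope() -> (usize, usize) {
    let a = {
        let s = Obj([1; 8]);
        let a = address(black_box(&s));
        drop(s);
        a
    };

    let r = Obj([2; 8]);
    let b = address(black_box(&r));
    drop(r);
    (a, b)
}

fn main() {
    let (a, b) = without_scope();
    println!("without scope: {:#x} {:#x} (reused: {})", a, b, a == b);

    let (a, b) = with_scope();
    println!("with scope:    {:#x} {:#x} (reused: {})", a, b, a == b);
}
```

Run it both with and without --release and compare the two lines; any difference between them reflects what your particular rustc and LLVM versions do, not a language guarantee.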

Emoun answered Sep 19 '22 07:09