 

How does Rust know whether to run the destructor during stack unwind?


The documentation for mem::uninitialized points out why it is dangerous/unsafe to use that function: calling drop on uninitialized memory is undefined behavior.

So this code should be, I believe, undefined:

    let a: TypeWithDrop = unsafe { mem::uninitialized() };
    panic!("=== Testing ==="); // Destructor of `a` will be run (U.B.)

However, I wrote this piece of code which works in safe Rust and does not seem to suffer from undefined behavior:

    #![feature(conservative_impl_trait)]

    trait T {
        fn disp(&mut self);
    }

    struct A;
    impl T for A {
        fn disp(&mut self) { println!("=== A ==="); }
    }
    impl Drop for A {
        fn drop(&mut self) { println!("Dropping A"); }
    }

    struct B;
    impl T for B {
        fn disp(&mut self) { println!("=== B ==="); }
    }
    impl Drop for B {
        fn drop(&mut self) { println!("Dropping B"); }
    }

    fn foo() -> impl T { return A; }
    fn bar() -> impl T { return B; }

    fn main() {
        let mut a;
        let mut b;

        let i = 10;
        let t: &mut T = if i % 2 == 0 {
            a = foo();
            &mut a
        } else {
            b = bar();
            &mut b
        };

        t.disp();
        panic!("=== Test ===");
    }

It always seems to execute the right destructor while ignoring the other one. If I try to use a or b directly (for example a.disp() instead of t.disp()), the compiler correctly rejects the code, saying I may be using uninitialized memory. What surprised me is that, while panicking, it always runs the right destructor (printing the expected string) no matter what the value of i is.

How does this happen? If the runtime can determine which destructor to run, should the requirement that memory must be initialized for types implementing Drop be removed from the documentation of mem::uninitialized() linked above?

Asked by ustulation, Sep 28 '16 at 14:09



2 Answers

Using drop flags.

Rust (up to and including version 1.12) stores a boolean flag in every value whose type implements Drop (and thus increases that type's size by one byte). That flag decides whether to run the destructor. So when you do b = bar() it sets the flag for the b variable, and thus only runs b's destructor. Vice versa with a.

Note that starting from Rust version 1.13 (at the time of this writing the beta compiler) that flag is not stored in the type, but on the stack for every variable or temporary. This is made possible by the advent of the MIR in the Rust compiler. The MIR significantly simplifies the translation of Rust code to machine code, and thus made it possible to move drop flags to the stack. Optimizations will usually eliminate the flag when the compiler can determine at compile time whether and where each object will be dropped.
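The stack-based scheme can be pictured with a hand-written emulation. The sketch below is only an illustration, not the compiler's actual output: `Noisy`, `compiler_managed` and `hand_rolled` are made-up names, and `MaybeUninit` (an API that did not exist when this answer was written; `assume_init_drop` needs Rust 1.60 or later) stands in for an uninitialized stack slot paired with an explicit boolean flag.

```rust
use std::cell::RefCell;
use std::mem::MaybeUninit;

/// Hypothetical helper type: records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Ordinary safe Rust: the compiler decides which destructor runs.
fn compiler_managed(init_a: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _a;
        let _b;
        if init_a {
            _a = Noisy { name: "a", log: &log };
        } else {
            _b = Noisy { name: "b", log: &log };
        }
    } // scope ends: only the variable that was initialized is dropped
    log.into_inner()
}

/// Assumed emulation: one explicit boolean per slot plays the drop flag.
fn hand_rolled(init_a: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let mut a: MaybeUninit<Noisy<'_>> = MaybeUninit::uninit();
        let mut a_flag = false; // flag starts as "no"
        let mut b: MaybeUninit<Noisy<'_>> = MaybeUninit::uninit();
        let mut b_flag = false;

        if init_a {
            a.write(Noisy { name: "a", log: &log });
            a_flag = true; // initialized: flag becomes "yes"
        } else {
            b.write(Noisy { name: "b", log: &log });
            b_flag = true;
        }

        // End of scope, reverse declaration order: check each flag first.
        if b_flag {
            // SAFETY: `b_flag` is only set right after `b` was written.
            unsafe { b.assume_init_drop() };
        }
        if a_flag {
            // SAFETY: `a_flag` is only set right after `a` was written.
            unsafe { a.assume_init_drop() };
        }
    }
    log.into_inner()
}

fn main() {
    for init_a in [true, false] {
        println!(
            "init_a = {}: compiler {:?}, emulation {:?}",
            init_a,
            compiler_managed(init_a),
            hand_rolled(init_a)
        );
    }
}
```

Both functions should log the same single destructor for each input, which is the behavior the question observed.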

You can "observe" this flag in a Rust compiler up to version 1.12 by looking at the size of the type:

    struct A;

    struct B;

    impl Drop for B {
        fn drop(&mut self) {}
    }

    fn main() {
        println!("{}", std::mem::size_of::<A>());
        println!("{}", std::mem::size_of::<B>());
    }

This prints 0 and 1 respectively with in-type drop flags (up to 1.12), and 0 and 0 with stack-based flags.

Using mem::uninitialized is still unsafe, however, because the compiler still sees the assignment to the a variable and sets the drop flag. Thus the destructor will be called on uninitialized memory. Note that in your example the Drop impl does not access any memory of your type (except for the drop flag, but that is invisible to you). Therefore you are not accessing the uninitialized memory (which is zero bytes in size anyway, since your type is a zero-sized struct). To the best of my knowledge that means that your unsafe { std::mem::uninitialized() } code is actually safe, because afterwards no memory unsafety can occur.

Answered by oli_obk, Oct 07 '22 at 22:10



There are two questions hidden here:

  1. How does the compiler track which variable is initialized or not?
  2. Why may initializing with mem::uninitialized() lead to Undefined Behavior?

Let's tackle them in order.


How does the compiler track which variable is initialized or not?

The compiler injects so-called "drop flags": for each variable for which Drop must run at the end of the scope, a boolean flag is injected on the stack, stating whether this variable needs to be disposed of.

The flag starts off "no", moves to "yes" if the variable is initialized, and back to "no" if the variable is moved from.

Finally, when the time comes to drop this variable, the flag is checked and the variable is dropped only if necessary.

Drop flags are separate from the compiler's flow analysis, which is what complains about potentially uninitialized variables: code (drop flags included) is generated only once that analysis is satisfied.
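The "yes"/"no" transitions can be observed from safe code by moving a value on only one branch. This is a small sketch under assumptions: `Tracked` and `conditional_move` are hypothetical names, and the log is just a convenient way to see when the destructor runs.

```rust
use std::cell::RefCell;

/// Hypothetical helper type: records its name in a shared log when dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn conditional_move(move_it: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let v = Tracked { name: "v", log: &log }; // initialized: flag is "yes"
        if move_it {
            drop(v); // moved out: flag goes back to "no", destructor runs here
            log.borrow_mut().push("after move");
        }
        log.borrow_mut().push("end of scope");
        // `v` is dropped here only if it was not moved on the branch above
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", conditional_move(true));  // ["v", "after move", "end of scope"]
    println!("{:?}", conditional_move(false)); // ["end of scope", "v"]
}
```

Whether `v` is still live at the end of the scope depends on a runtime value, so this is a case where a flag is genuinely needed rather than optimized away.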


Why may initializing with mem::uninitialized() lead to Undefined Behavior?

When using mem::uninitialized() you make a promise to the compiler: don't worry, I'm definitely initializing this.

As far as the compiler is concerned, the variable is therefore fully initialized, and the drop flag is set to "yes" (until you move out of it).

This, in turn, means that Drop will be called.

Using an uninitialized object is Undefined Behavior, and the compiler calling Drop on an uninitialized object on your behalf counts as "using it".
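For contrast, later Rust versions deprecated mem::uninitialized() in favor of `std::mem::MaybeUninit`, where that promise is made explicitly and only at the point you call `assume_init`. The sketch below assumes a recent toolchain (`MaybeUninit::write` needs Rust 1.55 or later); `TypeWithDrop` is borrowed from the question, while the `drops` counter field and both function names are invented for illustration.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

/// Stand-in for the question's `TypeWithDrop`; counts destructor runs.
struct TypeWithDrop<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for TypeWithDrop<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// The slot is never initialized, so no destructor may run, and none does:
/// `MaybeUninit<T>` never drops its contents on its own.
fn left_uninitialized() -> u32 {
    let drops = Cell::new(0);
    {
        let _slot: MaybeUninit<TypeWithDrop<'_>> = MaybeUninit::uninit();
    }
    drops.get()
}

/// The slot is written first; only then do we promise it is initialized.
fn initialized_then_assumed() -> u32 {
    let drops = Cell::new(0);
    {
        let mut slot = MaybeUninit::<TypeWithDrop<'_>>::uninit();
        slot.write(TypeWithDrop { drops: &drops });
        // SAFETY: `slot` was fully initialized by `write` just above.
        let _value = unsafe { slot.assume_init() };
        // `_value` is an ordinary variable now and is dropped at scope end.
    }
    drops.get()
}

fn main() {
    println!("never initialized: {} drop(s)", left_uninitialized());
    println!("initialized first: {} drop(s)", initialized_then_assumed());
}
```

The first function should report 0 drops and the second 1, so the destructor only ever sees a value that was actually written.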


Bonus:

In my tests, nothing weird happened!

Note that Undefined Behavior means that anything can happen; anything, unfortunately, also includes "seems to work" (or even "works as intended despite the odds").

In particular, if you do NOT access the object's memory in Drop::drop (just printing), then it's very likely that everything will just work. If you do access it, however, you might see weird integers, pointers pointing into the wild, etc...

And if the optimizer is clever, even without accessing it, it might do weird things! Since we are using LLVM, I invite you to read What every C programmer should know about Undefined Behavior by Chris Lattner (LLVM's father).

Answered by Matthieu M., Oct 07 '22 at 22:10
