I read the rust book over the weekend and I have a question about the concept of ownership. The impression I got is that ownership is used to statically determine where a resource can be deallocated. Now, suppose that we have the following:
{ // 1
let x; // 2
{ // 3
let y = Box::new(1); // 4
x = if flip_coin() {y} else {Box::new(2)} // 5
} // 6
} // 7
I was surprised to see that the compiler accepts this program. By inserting println!
s and implementing the Drop
trait for the boxed value, I saw that the box containing the value 1 will be deallocated at either line 6 or 7 depending on the return value of flip_coin
. How does the compiler know when to deallocate that box? Is this decided at run-time using some run-time information (like a flag to indicate if the box is still in use)?
After some research I found out that Rust currently adds a flag to every type that implements the Drop
trait so that it knows whether the value has been dropped or not, which of course incurs a run-time cost. There have been proposals to avoid that cost by using static drops or eager drops but those solutions had problems with their semantics, namely that drops could occur at places that you wouldn't expect (e.g. in the middle of a code block), especially if you are used to C++ style RAII. There is now consensus that the best compromise is a different solution where the flags are removed from the types. Instead flags will be added to the stack, but only when the compiler cannot figure out when to do the drop
statically (while having the same semantics as C++) which specifically happens when there are conditional moves like the example given in this question. For all other cases there will be no run-time cost. It appears though, that this proposal will not be implemented in time for 1.0.
Note that C++ has similar run-time costs associated with unique_ptr
. When the new Drop
is implemented, Rust will be strictly better than C++ in that respect.
I hope this is a correct summary of the situation. Credit goes to u/dyoll1013, u/pcwalton, u/!!kibwen, u/Kimundi on reddit, and Chris Morgan here on SO.
In non-optimized code, Rust uses dynamic checks, but it's likely that they will be eliminated in optimized code.
I looked at the behavior of the following code:
#[derive(Debug)]
struct A {
s: String
}
impl Drop for A {
fn drop(&mut self) {
println!("Dropping {:?}", &self);
}
}
fn flip_coin() -> bool { false }
#[allow(unused_variables)]
pub fn test() {
let x;
{
let y1 = A { s: "y1".to_string() };
let y2 = A { s: "y2".to_string() };
x = if flip_coin() { y1 } else { y2 };
println!("leaving inner scope");
}
println!("leaving middle scope");
}
Consistent with your comment on the other answer, the call to drop
for the String that was left alone occurs after the "leaving inner scope" println. That does seem consistent with one's expectation that the y's scopes extend to the end of their block.
Looking at the assembly language, compiled without optimization, it seems that the if
statement not only copies either y1 or y2 to x, but also zeroes out whichever variable provided the source for the move. Here's the test:
.LBB14_8:
movb -437(%rbp), %al
andb $1, %al
movb %al, -177(%rbp)
testb $1, -177(%rbp)
jne .LBB14_11
jmp .LBB14_12
Here's the 'then' branch, which moves the "y1" String to x. Note especially the call to memset
, which is zeroing out y1 after the move:
.LBB14_11:
xorl %esi, %esi
movl $32, %eax
movl %eax, %edx
leaq -64(%rbp), %rcx
movq -64(%rbp), %rdi
movq %rdi, -176(%rbp)
movq -56(%rbp), %rdi
movq %rdi, -168(%rbp)
movq -48(%rbp), %rdi
movq %rdi, -160(%rbp)
movq -40(%rbp), %rdi
movq %rdi, -152(%rbp)
movq %rcx, %rdi
callq memset@PLT
jmp .LBB14_13
(It looks horrible until you realize that all those movq
instructions are just copying 32 bytes from %rbp-64
, which is y1, to %rbp-176
, which is x, or at least some temporary that'll eventually be x.) Note that it copies 32 bytes, not the 24 you'd expect for a Vec (one pointer plus two usizes). This is because Rust adds a hidden "drop flag" to the structure, indicating whether the value is live or not, following the three visible fields.
And here's the 'else' branch, doing exactly the same for y2:
.LBB14_12:
xorl %esi, %esi
movl $32, %eax
movl %eax, %edx
leaq -128(%rbp), %rcx
movq -128(%rbp), %rdi
movq %rdi, -176(%rbp)
movq -120(%rbp), %rdi
movq %rdi, -168(%rbp)
movq -112(%rbp), %rdi
movq %rdi, -160(%rbp)
movq -104(%rbp), %rdi
movq %rdi, -152(%rbp)
movq %rcx, %rdi
callq memset@PLT
.LBB14_13:
This is followed by the code for the "leaving inner scope" println, which is painful to behold, so I won't include it here.
We then call a "glue_drop" routine on both y1 and y2. This seems to be a compiler-generated function that takes an A, checks its String's Vec's drop flag, and if that's set, invokes A's drop routine, followed by the drop routine for the String it contains.
If I'm reading this right, it's pretty clever: even though it's the A that has the drop
method we need to call first, Rust knows that it can use ... inhale ... the drop flag of the Vec inside the String inside the A as the flag that indicates whether the A needs to be dropped.
Now, when compiled with optimization, inlining and flow analysis should recognize situations where the drop definitely will happen (and omit the run-time check), or definitely will not happen (and omit the drop altogether). And I believe I have heard of optimizations that duplicate the code following a then/else clause into both paths, and then specialize them. This would eliminate all run-time checks from this code (but duplicate the println! call).
As the original poster points out, there's an RFC proposal to move drop flags out of the values and instead associate them with the stack slots holding the values.
So it's plausible that the optimized code might not have any run-time checks at all. I can't bring myself to read the optimized code, though. Why not give it a try yourself?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With