
How does Rust achieve compile-time-only pointer safety?

Tags: pointers, rust, memory-safety

I have read somewhere that in a language that features pointers, the compiler cannot fully decide at compile time whether all pointers are used correctly and/or are valid (refer to a live object), since that would essentially constitute solving the halting problem. That is not surprising, intuitively, because in this case, we would be able to infer the runtime behavior of a program at compile time, similarly to what's stated in this related question.

However, from what I can tell, the Rust language requires that pointer checking be done entirely at compile time (there's no undefined behavior related to pointers, "safe" pointers at least, and there's no "invalid pointer" or "null pointer" runtime exception either).

Assuming that the Rust compiler doesn't solve the halting problem, where does the fallacy lie?

Is it the case that pointer checking isn't done entirely at compile-time, and Rust's smart pointers still introduce some runtime overhead compared to, say, raw pointers in C?
Or is it possible that the Rust compiler can't make fully correct decisions, and it sometimes needs to Just Trust The Programmer™, probably using one of the lifetime annotations (the ones with the <'lifetime_ident> syntax)? In this case, does this mean that the pointer/memory safety guarantee is not 100%, and still relies on the programmer writing correct code?
Another possibility is that Rust pointers are non-"universal" or restricted in some sense, so that the compiler can infer their properties entirely during compile-time, but they are not as useful as e.g. raw pointers in C or smart pointers in C++.
Or maybe it is something completely different and I'm misinterpreting one or more of { "pointer", "safety", "guaranteed", "compile-time" }.


asked Apr 14 '15 13:04

The Paramagnetic Croissant

1 Answer

Disclaimer: I'm in a bit of a hurry, so this is a bit meandering. Feel free to clean it up.

The One Sneaky Trick That Language Designers Hate™ is basically this: Rust can only reason about the 'static lifetime (used for global variables and other whole-program lifetime things) and the lifetime of stack (i.e. local) variables: it cannot express or reason about the lifetime of heap allocations.

This means a few things. First of all, the library types that deal with heap allocations (e.g. Box<T>, Rc<T>, Arc<T>) all own the thing they point to. As a result, they don't actually need lifetimes in order to exist.
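To make the "no lifetimes needed" point concrete, here is a minimal sketch. The names Owners, Borrower and make_owners are made up for illustration and are not from any library. The struct of owning pointers has no lifetime parameter and can be returned freely, while the struct holding a plain borrowed pointer has to name one.

```rust
use std::rc::Rc;
use std::sync::Arc;

// Hypothetical struct for illustration: there is no `<'a>` anywhere.
// Each field owns its heap allocation, so no lifetime parameter is needed.
struct Owners {
    unique: Box<i32>,
    shared: Rc<String>,
    shared_across_threads: Arc<Vec<u8>>,
}

// The allocations are created inside this function, yet returning them is
// fine: ownership moves out with the value and nothing borrowed stays behind.
fn make_owners() -> Owners {
    Owners {
        unique: Box::new(1),
        shared: Rc::new(String::from("hello")),
        shared_across_threads: Arc::new(vec![1, 2, 3]),
    }
}

// By contrast, a struct that stores a plain borrowed pointer must name a lifetime.
struct Borrower<'a> {
    target: &'a i32,
}
```

A Borrower can only be built from something that already lives somewhere, e.g. `Borrower { target: &*owners.unique }`, and it then cannot outlive the variable `owners`.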

Where you do need lifetimes is when you're accessing the contents of a smart pointer. For example:

let mut x: Box<i32> = Box::new(0); // `box 0` was unstable, pre-1.0 syntax
*x = 42;

What is happening behind the scenes on that second line is this:

{
    use std::ops::DerefMut; // needed to call `deref_mut` by name
    let box_ref: &mut Box<i32> = &mut x;
    let heap_ref: &mut i32 = box_ref.deref_mut();
    *heap_ref = 42;
}

In other words, because Box isn't magic, we have to tell the compiler how to turn it into a regular, run-of-the-mill borrowed pointer. This is what the Deref and DerefMut traits are for. This raises the question: what, exactly, is the lifetime of heap_ref?
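As a rough illustration of "telling the compiler" how to get a borrowed pointer out of a smart pointer, here is a sketch of a made-up wrapper type. MyBox and store_42 are hypothetical names; the only real library items are the Deref and DerefMut traits from std::ops. Once both traits are implemented, `*x = 42` works on the wrapper exactly as it does on Box.

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical smart pointer for illustration; it simply owns a Box.
struct MyBox<T> {
    inner: Box<T>,
}

impl<T> MyBox<T> {
    fn new(value: T) -> Self {
        MyBox { inner: Box::new(value) }
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;
    // Hands out a plain `&T`; `&Box<T>` coerces to `&T` here.
    fn deref(&self) -> &T {
        &self.inner
    }
}

impl<T> DerefMut for MyBox<T> {
    // Hands out a plain `&mut T` into the owned allocation.
    fn deref_mut(&mut self) -> &mut T {
        &mut self.inner
    }
}

fn store_42() -> i32 {
    let mut x = MyBox::new(0);
    *x = 42; // goes through `DerefMut::deref_mut(&mut x)`
    *x       // goes through `Deref::deref(&x)`, then copies the i32 out
}
```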

The answer to this is in the definition of DerefMut (from memory because I'm in a hurry):

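// Lifetimes written out for clarity; the standard library's version elides them.
trait DerefMut: Deref {
    // `Target` is an associated type declared on the `Deref` supertrait.
    fn deref_mut<'a>(&'a mut self) -> &'a mut Self::Target;
}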

Like I said before, Rust absolutely cannot talk about "heap lifetimes". Instead, it has to tie the lifetime of the heap-allocated i32 to the only other lifetime it has on hand: the lifetime of the Box.
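A small sketch of what "tied to the lifetime of the Box" means in practice. The function names are made up for illustration, and the rejected variant is left as a comment so that the rest still builds.

```rust
// Rejected by the compiler, so it is kept as a comment to let the sketch build:
//
// fn dangling() -> i32 {
//     let r: &i32;
//     {
//         let boxed = Box::new(7);
//         r = &*boxed; // borrow tied to the variable `boxed`
//     }                // `boxed` goes out of scope and frees the heap memory
//     *r               // compile error: the borrow would outlive `boxed`
// }

// Accepted: the borrow is only used while `boxed` is still in scope.
fn not_dangling() -> i32 {
    let boxed = Box::new(7);
    let r: &i32 = &*boxed;
    *r
}
```

The compiler never has to know when the heap allocation dies; it only has to check that `r` is not used after the variable `boxed` is gone.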

What this means is that "complicated" things don't have an expressible lifetime, and thus have to own the thing they manage. When you convert a complicated smart pointer/handle into a simple borrowed pointer, that is the moment that you have to introduce a lifetime, and you usually just use the lifetime of the handle itself.
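Here is a sketch of that conversion for a made-up owning type. Handle and its methods are hypothetical names for illustration. The accessor introduces a lifetime and ties it to the handle; Rust's lifetime elision rules pick the same lifetime when nothing is written, which is why the two methods below are equivalent.

```rust
// Hypothetical owning handle for illustration: no lifetime parameter,
// because it owns its heap-allocated contents.
struct Handle {
    data: Vec<i32>,
}

impl Handle {
    fn new(data: Vec<i32>) -> Self {
        Handle { data }
    }

    // Explicit form: the returned borrow lives at most as long as the
    // borrow of the handle itself. Panics if `data` is empty.
    fn first<'a>(&'a self) -> &'a i32 {
        &self.data[0]
    }

    // Elided form: with a `&self` receiver, the compiler assigns the
    // receiver's lifetime to the returned reference, same as above.
    fn first_elided(&self) -> &i32 {
        &self.data[0]
    }
}
```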

Actually, I should clarify: by "lifetime of the handle", I really mean "the lifetime of the variable in which the handle is currently being stored": lifetimes are really for storage, not for values. This is typically what trips up newcomers to Rust when they can't work out why something like this is rejected:

// Does not compile: the returned reference borrows the local variable `x`.
fn thingy<'a>() -> (Box<i32>, &'a i32) {
    let x = Box::new(1701);
    (x, &x)
}

"But... I know that the box will continue to live on, why does the compiler say it doesn't?!" Because Rust can't reason about heap lifetimes and must resort to tying the lifetime of &x to the variable x, not the heap allocation it happens to point to.

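For contrast with the rejected thingy, one way to restructure it (a sketch, not the only option; use_thingy is a made-up name) is to return only the owning Box and take the borrow in the caller, where there is a variable to tie it to.

```rust
// Return only the owning handle; do not try to return a borrow alongside it.
fn thingy() -> Box<i32> {
    Box::new(1701)
}

fn use_thingy() -> i32 {
    let x = thingy();  // the Box is now stored in the caller's variable `x`
    let r: &i32 = &x;  // borrow tied to `x` (`&Box<i32>` coerces to `&i32`)
    // let y = x;      // rejected if `r` is used afterwards: moving the Box out
    //                 // of `x` invalidates the borrow, even though the heap
    //                 // allocation itself would stay where it is
    *r
}
```

The commented-out move shows the "storage, not values" point: the i32 on the heap would not go anywhere, but the borrow is tied to the variable `x`, so the compiler refuses.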
answered Sep 21 '22 13:09

DK.
