
How does Rust achieve compile-time-only pointer safety?

I have read somewhere that in a language that features pointers, the compiler cannot decide fully at compile time whether all pointers are used correctly and are valid (i.e. refer to a live object), since that would essentially amount to solving the halting problem. That is not surprising, intuitively, because in this case we would be able to infer the runtime behavior of a program at compile time, similarly to what's stated in this related question.

However, from what I can tell, the Rust language requires that pointer checking be done entirely at compile time (there's no undefined behavior related to pointers, "safe" pointers at least, and there's no "invalid pointer" or "null pointer" runtime exception either).

Assuming that the Rust compiler doesn't solve the halting problem, where does the fallacy lie?

  • Is it the case that pointer checking isn't done entirely at compile-time, and Rust's smart pointers still introduce some runtime overhead compared to, say, raw pointers in C?
  • Or is it possible that the Rust compiler can't make fully correct decisions, and it sometimes needs to Just Trust The Programmer™, probably using one of the lifetime annotations (the ones with the <'lifetime_ident> syntax)? In this case, does this mean that the pointer/memory safety guarantee is not 100%, and still relies on the programmer writing correct code?
  • Another possibility is that Rust pointers are non-"universal" or restricted in some sense, so that the compiler can infer their properties entirely at compile time, but they are not as useful as e.g. raw pointers in C or smart pointers in C++.
  • Or maybe it is something completely different and I'm misinterpreting one or more of { "pointer", "safety", "guaranteed", "compile-time" }.
The Paramagnetic Croissant asked Apr 14 '15 13:04


1 Answer

Disclaimer: I'm in a bit of a hurry, so this is a bit meandering. Feel free to clean it up.

The One Sneaky Trick That Language Designers Hate™ is basically this: Rust can only reason about the 'static lifetime (used for global variables and other whole-program lifetime things) and the lifetime of stack (i.e. local) variables: it cannot express or reason about the lifetime of heap allocations.

This means a few things. First of all, the library types that deal with heap allocations (e.g. Box<T>, Rc<T>, Arc<T>) all own the thing they point to. As a result, they don't need lifetime parameters in order to exist.
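To make the ownership point concrete, here is a minimal sketch. The names make_boxed and Holder are made up for illustration and are not from the standard library. Neither the function signature nor the struct needs a lifetime parameter, because each handle owns its allocation.

```rust
use std::rc::Rc;

// No lifetime parameter in the signature: the returned Box owns its heap allocation.
fn make_boxed(value: i32) -> Box<i32> {
    Box::new(value)
}

// Hypothetical struct, for illustration: it stores owning handles and
// needs no `<'a>` parameter of its own.
struct Holder {
    boxed: Box<i32>,
    shared: Rc<String>,
}

fn main() {
    let text = Rc::new(String::from("heap data"));
    let holder = Holder {
        boxed: make_boxed(7),
        shared: Rc::clone(&text),
    };
    assert_eq!(*holder.boxed, 7);
    assert_eq!(holder.shared.as_str(), "heap data");
    // Two Rc handles currently share ownership of the same String.
    assert_eq!(Rc::strong_count(&text), 2);
    drop(holder);
    // Dropping the struct dropped its Rc handle; one owner remains.
    assert_eq!(Rc::strong_count(&text), 1);
}
```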

Where you do need lifetimes is when you're accessing the contents of a smart pointer. For example:

let mut x: Box<i32> = Box::new(0);
*x = 42;

What is happening behind the scenes on that second line is this:

{
    let box_ref: &mut Box<i32> = &mut x;
    let heap_ref: &mut i32 = box_ref.deref_mut();
    *heap_ref = 42;
}
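The desugared form can be written out as a runnable sketch. The helper name store is an assumption added for illustration; the only extra ingredient is importing the DerefMut trait so that deref_mut can be called explicitly.

```rust
use std::ops::DerefMut;

// Hypothetical helper: writes `value` into the boxed integer using the
// explicit, trait-based spelling of `*x = value`.
fn store(mut x: Box<i32>, value: i32) -> Box<i32> {
    {
        let box_ref: &mut Box<i32> = &mut x;
        let heap_ref: &mut i32 = box_ref.deref_mut();
        *heap_ref = value;
    } // both borrows end here, so `x` can be moved out again
    x
}

fn main() {
    let x = store(Box::new(0), 42);
    assert_eq!(*x, 42);
}
```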

In other words, because Box isn't magic, we have to tell the compiler how to turn it into a regular, run-of-the-mill borrowed pointer. This is what the Deref and DerefMut traits are for. This raises the question: what, exactly, is the lifetime of heap_ref?

The answer to this is in the definition of DerefMut (from memory because I'm in a hurry):

trait DerefMut: Deref {
    // `Target` is an associated type declared on the `Deref` supertrait.
    fn deref_mut<'a>(&'a mut self) -> &'a mut Self::Target;
}

Like I said before, Rust absolutely cannot talk about "heap lifetimes". Instead, it has to tie the lifetime of the heap-allocated i32 to the only other lifetime it has on hand: the lifetime of the Box.

What this means is that "complicated" things don't have an expressible lifetime, and thus have to own the thing they manage. When you convert a complicated smart pointer/handle into a simple borrowed pointer, that is the moment that you have to introduce a lifetime, and you usually just use the lifetime of the handle itself.
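As a sketch of that conversion, consider a made-up owning handle (Handle and peek are hypothetical names, not library items). The handle itself carries no lifetime; one only appears when a plain borrowed pointer is produced from it, and that lifetime is the one of the borrow of the handle.

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical owning handle: it owns its heap allocation, so the type
// has no lifetime parameter.
struct Handle<T> {
    inner: Box<T>,
}

impl<T> Deref for Handle<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.inner
    }
}

impl<T> DerefMut for Handle<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.inner
    }
}

// The lifetime is introduced only here, and it is tied to the handle:
// the returned reference cannot outlive the borrow of `handle`.
fn peek<'a>(handle: &'a Handle<i32>) -> &'a i32 {
    handle
}

fn main() {
    let mut h = Handle { inner: Box::new(1) };
    *h = 5; // goes through DerefMut
    assert_eq!(*peek(&h), 5);
}
```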

Actually, I should clarify: by "lifetime of the handle", I really mean "the lifetime of the variable in which the handle is currently being stored": lifetimes are really for storage, not for values. This is typically why newcomers to Rust get tripped up when they can't work out why they can't do something like:

fn thingy<'a>() -> (Box<i32>, &'a i32) {
    let x = Box::new(1701);
    (x, &x) // does not compile
}

"But... I know that the box will continue to live on, why does the compiler say it doesn't?!" Because Rust can't reason about heap lifetimes and must resort to tying the lifetime of &x to the variable x, not the heap allocation it happens to point to.
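One common way around this, sketched here as an assumption about what the caller wants rather than as the only fix, is to return just the owning Box and take the borrow in the caller, where there is a variable whose lifetime the reference can be tied to. The helper borrow_from is a made-up name.

```rust
// Return only the owning handle; no lifetime is needed.
fn thingy() -> Box<i32> {
    Box::new(1701)
}

// Hypothetical helper: the reference's lifetime is tied to the caller's
// borrow of the Box, not to the heap allocation.
fn borrow_from<'a>(x: &'a Box<i32>) -> &'a i32 {
    x
}

fn main() {
    let x = thingy(); // `x` is now storage owned by `main`
    let r: &i32 = borrow_from(&x); // borrow lives no longer than `x`
    assert_eq!(*r, 1701);
}
```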

DK. answered Sep 21 '22 13:09
