
Need clarification on the Rust Nomicon section on (co)variance of `Box`, `Vec` and other collections

The Rust Nomicon has an entire section on variance which I more or less understand except this little section in regards to Box<T> and Vec<T> being (co)variant over T.

Box and Vec are interesting cases because they're variant, but you can definitely store values in them! This is where Rust gets really clever: it's fine for them to be variant because you can only store values in them via a mutable reference! The mutable reference makes the whole type invariant, and therefore prevents you from smuggling a short-lived type into them.

What confuses me is the following line:

it's fine for them to be variant because you can only store values in them via a mutable reference!

My first question is that I'm slightly confused as to what the mutable reference is to. Is it a mutable reference to the Box / Vec?

If so, how does the fact that I can only store values in them via a mutable reference justify their (co)variance? I understand what (co)variance is and the benefits of having it for Box<T>, Vec<T> etc., but I am struggling to see the link between only being able to store values via mutable references and the justification of (co)variance.

Also, when we initialize a Box, aren't values moved into the box without involving a mutable reference? Doesn't this contradict the statement that we can only store values in them via a mutable reference?

And finally, under what context is this 'mutable reference' borrowed? Do they mean that when you call methods that modify the Box or Vec you implicitly take an &mut self? Is that the mutable reference mentioned?


Update 2nd May 2018:

Since I have yet to receive a satisfactory answer to this question, I take it that the nomicon's explanation is genuinely confusing. So as promised in a comment thread below, I have opened an issue in the Rust Nomicon repository. You can track any updates there.

L.Y. Sim asked Apr 24 '18 08:04


3 Answers

I think that section could use some work to make it clearer.

I'm slightly confused as to what the mutable reference is to. Is it a mutable reference to the Box / Vec?

No. It means that if you store a value in an existing Box, you have to do it through a mutable reference to the data, for example one obtained with borrow_mut() (Box implements the BorrowMut trait).

The main idea this section is trying to convey is that you can't modify the contents of a Box while there is another reference to the contents. That's guaranteed because the Box owns its contents. In order to change the contents of a Box, you have to do it by taking a new mutable reference.

This means that — even if you did overwrite the contents with a shorter-lived value — it wouldn't matter because no one else could be using the old value. The borrow checker wouldn't allow it.

This is different from function arguments because a function has a code block which can actually do things with its arguments. In the case of a Box or Vec, you have to get the contents out, by mutably borrowing them, before you can do anything to them.
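As a minimal sketch of the point above (the helper `replace_contents` and the variable names are made up for illustration, they are not from the Nomicon): overwriting the contents of an existing Box goes through a mutable borrow, and the borrow checker refuses that borrow while a shared reference to the contents is still in use.

```rust
/// Overwrites the value inside an existing box and returns the old value.
/// `replace_contents` is a hypothetical helper used only for this sketch.
fn replace_contents(slot: &mut Box<String>, new_value: String) -> String {
    // `**slot` goes through the `&mut` and then through the `Box`.
    std::mem::replace(&mut **slot, new_value)
}

fn main() {
    let mut boxed = Box::new(String::from("old"));

    let peek: &String = &boxed; // shared borrow of the contents
    println!("before: {peek}");
    // Uncommenting the next two lines is rejected (E0502): `boxed` cannot be
    // borrowed mutably while `peek` is still used afterwards.
    // replace_contents(&mut boxed, String::from("new"));
    // println!("{peek}");

    // Once `peek` is no longer used, the mutable borrow is allowed.
    let old = replace_contents(&mut boxed, String::from("new"));
    println!("old = {old}, now = {boxed}");
}
```

Nobody can observe the old contents after the overwrite, because any earlier borrow must have ended before `&mut boxed` could be taken.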

Peter Hall answered Sep 29 '22 13:09


From the Nomicon:

Box and Vec are interesting cases because they're variant, but you can definitely store values in them! This is where Rust gets really clever: it's fine for them to be variant because you can only store values in them via a mutable reference! The mutable reference makes the whole type invariant, and therefore prevents you from smuggling a short-lived type into them.

Consider the Vec method for adding a value:

pub fn push(&'a mut self, value: T)

The type of self is &'a mut Vec<T>, and I understand this to be the mutable reference the Nomicon is talking about. Instantiating the last sentence of the quote above for the Vec case gives:

The type &'a mut Vec<T> is invariant, and therefore prevents you from smuggling a short-lived type into Vec<T>.

The same reasoning holds for Box.

Put another way: the values stored in a Vec or Box always outlive their container, even though Vec and Box are variant, because you can only store values in them via a mutable reference.

Consider the following snippet:

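fn main() {
    let mut v: Vec<&String> = Vec::new();

    {
        let mut a_value = "hola".to_string();

        // Same as `v.push(&a_value)`, written out to make `&mut self` visible.
        Vec::push(&mut v, &mut a_value);
    } // `a_value` is dropped here, but `v` still holds a reference to it

    // The Nomicon is saying that if the `&mut self` type were variant here,
    // we would end up with a vector containing a reference to freed memory.

    // That is not the case: the compiler rejects this program because
    // `a_value` does not live long enough (E0597).
    println!("{:?}", v); // using `v` after the block keeps the borrow alive
}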

It may help to note the similarity between Vec::push(&mut v, &mut a_value) and overwrite(&mut forever_str, &mut &*string) from the Nomicon example.
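For contrast, here is a sketch of a variant that does compile, assuming the referenced String is declared before the vector so that it outlives every use of it. The helper `push_ref` is hypothetical; it only makes the shared lifetime `'a` visible in a signature.

```rust
/// Hypothetical helper: pushes `value` using the explicit
/// `Vec::push(&mut v, ..)` form. The single lifetime `'a` ties the pushed
/// reference to the vector's element type.
fn push_ref<'a>(v: &mut Vec<&'a String>, value: &'a String) {
    Vec::push(v, value);
}

fn main() {
    // Declared before `v`, so it is still alive whenever `v` is used.
    let a_value = "hola".to_string();
    let mut v: Vec<&String> = Vec::new();

    push_ref(&mut v, &a_value);

    println!("{:?}", v); // the reference is still valid here
}
```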

attdona answered Sep 29 '22 11:09


Since opening the issue in the Nomicon repo, the maintainers have introduced a revision to the section which I feel is considerably clearer. The revision has been merged. I consider my question answered by the revision.

Below I provide a brief summary of what I know.

The part that relates to my question now reads as follows (emphasis mine):

Box and Vec are interesting cases because they're covariant, but you can definitely store values in them! This is where Rust's typesystem allows it to be a bit more clever than others. To understand why it's sound for owning containers to be covariant over their contents, we must consider the two ways in which a mutation may occur: by-value or by-reference.

If mutation is by-value, then the old location that remembers extra details is moved out of, meaning it can't use the value anymore. So we simply don't need to worry about anyone remembering dangerous details. Put another way, applying subtyping when passing by-value destroys details forever. For example, this compiles and is fine:

fn get_box<'a>(str: &'a str) -> Box<&'a str> {
    // String literals are `&'static str`s, but it's fine for us to
    // "forget" this and let the caller think the string won't live that long.
    Box::new("hello")
}

If mutation is by-reference, then our container is passed as &mut Vec<T>. But &mut is invariant over its value, so &mut Vec<T> is actually invariant over T. So the fact that Vec<T> is covariant over T doesn't matter at all when mutating by-reference.

The key point here really is the parallel between the invariance of &mut Vec<T> over T and the invariance of &mut T over T.
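A small sketch of the two cases, as I understand them. The functions `shorten` and `push_into` and the `_anchor` parameter are names I made up for illustration; `_anchor` exists only to give the lifetime `'a` something to attach to.

```rust
/// By value: a `Vec<&'static str>` can be handed out as a `Vec<&'a str>`,
/// because `Vec<T>` is covariant over `T` and `&'static str` is a subtype
/// of `&'a str`. The caller has given up the original vector.
fn shorten<'a>(v: Vec<&'static str>, _anchor: &'a str) -> Vec<&'a str> {
    v
}

/// By reference: `&mut Vec<&'a str>` is invariant over `&'a str`, so the
/// element type cannot be adjusted behind the `&mut`.
fn push_into<'a>(v: &mut Vec<&'a str>, item: &'a str) {
    v.push(item);
}

fn main() {
    let local = String::from("short-lived");

    let statics: Vec<&'static str> = vec!["a", "b"];
    let mut shortened = shorten(statics, &local);
    push_into(&mut shortened, &local); // fine: elements are already `&'a str`
    println!("{:?}", shortened);

    let mut forever: Vec<&'static str> = vec!["c"];
    // Rejected if uncommented: `&mut Vec<&'static str>` cannot be treated as
    // `&mut Vec<&'a str>`, so `&local` would have to be `'static`.
    // push_into(&mut forever, &local);
    push_into(&mut forever, "d"); // a `&'static str` is accepted
    println!("{:?}", forever);
}
```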

It was explained earlier in the revised nomicon section why a general &mut T cannot be covariant over T. &mut T borrows T, but it doesn't own T, meaning that there are other things that refer to T and have a certain expectation of its lifetime.

But if &mut T were covariant over T, then the overwrite function in the Nomicon's example shows how we could break the lifetime of T in the caller's location from a different location (i.e. within the body of overwrite).
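For reference, here is a sketch shaped like that overwrite example. The signature and body of `overwrite` are my reconstruction and should be treated as an assumption. The rejected call is left commented out so the rest compiles.

```rust
/// Both arguments share a single type parameter `T`, so the compiler must
/// find one `T` that fits both `&mut T` arguments exactly.
fn overwrite<T: Copy>(input: &mut T, new: &mut T) {
    *input = *new;
}

fn main() {
    let mut forever_str: &'static str = "hello";

    {
        let string = String::from("world");
        println!("short-lived: {string}");
        // Rejected if uncommented: the first argument fixes `T` to
        // `&'static str`, but `&*string` only lives until the end of this
        // block, and the invariance of `&mut T` forbids shrinking `T`.
        // overwrite(&mut forever_str, &mut &*string);
    }

    // With two `&'static str` values, `T` is the same on both sides.
    let mut replacement: &'static str = "world";
    overwrite(&mut forever_str, &mut replacement);
    println!("{forever_str}"); // prints "world"
}
```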

In a sense, covariance over T for a type constructor lets us 'forget the original lifetime of T' when passing the type constructor. This forgetting is OK for &T because there is no chance of modifying T through it, but it is dangerous for &mut T because we could modify T after forgetting lifetime details about it. This is why &mut T needs to be invariant over T.

It seems the point the nomicon is trying to make is: it's OK for Box<T> to be covariant over T because it does not introduce unsafeness.

One of the consequences of this covariance is that we are allowed to 'forget the original lifetime of T' when passing Box<T> by value. But this does not introduce unsafeness, because when we pass by value we are guaranteeing that there are no further users of T in the location that Box<T> was moved from. No one in the old location is counting on the previous lifetime of T still holding after the move.

But more importantly, Box<T> being covariant over T does not introduce unsafeness when it comes to taking a mutable reference to the Box<T>, because &mut Box<T> is invariant over Box<T> and therefore invariant over T. So, similar to the &mut T discussion above, we are unable to perform lifetime shenanigans through an &mut Box<T> by forgetting lifetime details about T and then modifying it after.

L.Y. Sim answered Sep 29 '22 13:09