
Why can a generic lifetime parameter in Rust be specialized to two disjoint lifetimes for one object?

Tags: rust, lifetime

In the following piece of code, I am trying to understand how the generic lifetime parameter 'a is specialized.

struct Wrapper<'a>(&'a i32);

fn foo() {
    let mut r;

    {
        let x = 0;       // the lifetime of x, call it 'x, starts from here |
        r = Wrapper(&x); // 'a parameter in Wrapper is specialized to 'x    |
        drop(r);         //                                                 |
    }                    //  --------------------------------- 'x ends here |         

    {
        let y = 0;       // the lifetime of y, call it 'y, starts from here   |
                         // 'y is distinct from 'x                            |
                         // neither outlives the other one                    |
        r = Wrapper(&y); // why 'a parameter can again be specialized to 'y?  |
        drop(r);         //                                                   |
    }                    // ------------------------------------ 'y ends here |
}

Why is it possible for the generic lifetime parameter 'a to be specialized to two disjoint lifetimes, 'x and 'y, for one object r?

From another point of view, I am very confused about the concrete type of r. From the subtyping and variance chapter of the Rustonomicon, I understand that the lifetime 'a is part of the generic type Wrapper<'a>. When the generic gets specialized, the type can be neither Wrapper<'x> nor Wrapper<'y>. Then what is the type of r?

Maybe higher-ranked trait bounds have something to do with it?

I would greatly appreciate it if someone could explain how the Rust compiler internally reasons about this.

Update:

The existing answer suggests that the lifetime of r can have multiple starting and ending points. If this is true, then 'r is the union of 'x and 'y. However, this explanation does not fit well with the subtyping rule. For example, in the following code, neither 'r2 nor 'r is a subtype of (or outlives) the other, so we should not be able to call bar(r, r2), but we actually can. Contradiction.

struct Wrapper<'a>(&'a i32);

fn bar<'a>(_p: Wrapper<'a>, _q: Wrapper<'a>) {}

fn foo() {
    let mut r;

    {
        let z = 0;
        let r2 = Wrapper(&z);               // -> |
        {                                   //    |
            let x = 0;       // -> |        //    +--'r2
            r = Wrapper(&x); //    |--'x--+ //    |
            bar(r, r2);      //    |      | // <- |
        }                    // <- |      |
    }                        //           |
    {                        //           +--'r
        let y = 0;           // -> |      |
        r = Wrapper(&y);     //    |--'y--+
        drop(r);             //    |
    }                        // <- |
}
asked May 21 '21 06:05 by Zhiyao


1 Answer

You are thinking about lifetimes in the wrong way. The lifetime of r doesn't get specialized into either 'x or 'y. It has its own lifetime (let's call it 'r).

As explained in the Non-Lexical Lifetimes RFC, the lifetime of a value (or its scope, as the RFC calls it) is the set of points in the program where the value may still be used in the future. Crucially, lifetimes aren't necessarily contiguous, nor do they have to match a specific syntactic scope (i.e. a {} block). Additionally, lifetimes aren't connected to a variable (like r), but to the values bound to it. Therefore, if you assign to a variable (e.g. r = ..), you are effectively killing one value and starting another. It doesn't count as a use. However, assigning to a member of a value (e.g. r.0 = ..) is a use, since you are changing a small part of an existing value.

In your case, 'r has two starting points and two end points. The first start point is at r = Wrapper(&x); and the first endpoint is at the first drop(r). The second start and end points are in the y-block. Visualized:

struct Wrapper<'a>(&'a i32);
fn foo() {
    let mut r;

    {
        let x = 0;       
        r = Wrapper(&x); // -> |
                         //    |--'x--+
        drop(r);         // <- |      |      
    }                    //           |
    {                    //           +--'r
        let y = 0;       //           |
        r = Wrapper(&y); // -> |      |
                         //    |--'y--+
        drop(r);         // <- |
    }
}

Here I also inserted 'x and 'y for reference. Note how 'r is specifically not live between the two scopes.

When borrow checking, the requirement is expressed as outlives relations between lifetimes. Let's say you have the bound 'a: 'b ('a must outlive 'b). For this bound to hold, anywhere 'b is live, 'a must be live too. This is what outlives means. Put another way: 'a lasts at least as long as 'b. However, we also have to account for borrow checking being location-aware. This means that 'a only needs to outlive 'b starting from the point where their relation begins.

So let's look at our example. For a statement like r = Wrapper(&'x x) (where I have added the lifetime for reference), the borrow checker must ensure that the lifetime of the reference &'x x outlives the lifetime of r (i.e. 'x: 'r), beginning at r = Wrapper(&x). Looking at the lifetimes above, we can see that from that point on, wherever this value of r is live, 'x is live too. 'x dies when the scope ends, but luckily that value of r is already dead by then. So this checks out. Rinse and repeat for 'y.
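
As a minimal sketch of the reasoning above (the function name read_twice and the values 1 and 2 are made up for illustration, not part of the question), the following variant reads through r once in each block. Each read happens while the corresponding referent is still live, and r is never used between the two blocks, so the borrow checker should accept it:

```rust
struct Wrapper<'a>(&'a i32);

// Hypothetical helper: reuses one variable `r` for two values whose
// borrows come from disjoint scopes.
fn read_twice() -> (i32, i32) {
    let mut r;
    let first;
    let second;

    {
        let x = 1;
        r = Wrapper(&x); // first value of r starts here
        first = *r.0;    // last use of the first value
    }                    // x ends; the first value of r is already dead

    {
        let y = 2;
        r = Wrapper(&y); // assignment kills the old value, starts a new one
        second = *r.0;   // last use of the second value
    }

    (first, second)
}
```

Calling read_twice() from main returns both values that were read, which shows that the single variable r really did hold borrows of x and of y at different times.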


To get more confident in this, let's look at some variations of the example and see how we can predict whether the compiler will accept or reject them.

Example 1

We will simply remove the drop calls to see if it makes a difference:

struct Wrapper<'a>(&'a i32);
fn foo() {
    let mut r;

    {
        let x = 0;
        r = Wrapper(&x);
    }        

    {
        let y = 0;
        r = Wrapper(&y);
    }
}

This doesn't change any lifetimes. Remember, lifetimes are about whether values will be used in the future. Since we haven't added any uses of r, x, or y, no lifetimes change. And as expected, it compiles just fine.

Example 2

What if x weren't in its own scope:

struct Wrapper<'a>(&'a i32);
fn foo() {
    let mut r;

    let x = 0;
    r = Wrapper(&x);
    
    {
        let y = 0;
        r = Wrapper(&y);
    }
}

Here 'r and 'x have moved into the parent scope, but otherwise they are both very similar to the original example. Likewise with 'y. Therefore all the bounds still hold, and it compiles. The same would be true if y were moved out of its scope instead.

Example 3

So let's change some uses of r. What if, instead of creating a new Wrapper for y, we simply overwrite the reference:

struct Wrapper<'a>(&'a i32);
fn foo() {
    let mut r;
    {
        let x = 0;
        r = Wrapper(&x);
    }
    {
        let y = 0;
        r.0 = &y;
    }
}

This materially changes 'r, since after the first scope the value is used again in the second scope, where it is assigned a new reference. Therefore, 'r now spans from r = Wrapper(&x) to r.0 = &y. However, 'x cannot match this, as x won't live long enough (it goes out of scope). Therefore, we should get an error saying x doesn't live long enough, which we do.
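
For contrast, here is a sketch of a variant where the field assignment is accepted, assuming it is fine to declare both referents in the same scope as r (the function name overwrite_field and the values are illustrative only). Because r.0 = &y is a use of the existing value, that value stays live from its creation up to the field write, so x has to be live for that whole stretch, and here it is:

```rust
struct Wrapper<'a>(&'a i32);

// Hypothetical helper: overwrites the field of an existing Wrapper
// instead of assigning a whole new value to the variable.
fn overwrite_field() -> i32 {
    let x = 1;
    let y = 2;

    let mut r = Wrapper(&x); // the value of r starts here
    r.0 = &y;                // a use of that same value, not a new value
    *r.0                     // reads y through the overwritten field
}
```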

Example 4

What if we just use r after both scopes:

#[derive(Debug)]
struct Wrapper<'a>(&'a i32);
fn foo() {
    let mut r;
    {
        let x = 0;
        r = Wrapper(&x);
    }
    {
        let y = 0;
        r = Wrapper(&y);
    }
    println!("{:?}", r)
}

Here, 'r is now slightly larger than in the original example, continuing after r = Wrapper(&y) to reach the print. However, it is still not live between the two blocks, since the value assigned in the first scope is never used in the second scope. Therefore, 'x still outlives 'r where it needs to. However, 'y now needs to be large enough to reach the print (since 'r does so), which it can't, since y will go out of scope. Therefore, the compiler should complain that y doesn't live long enough, but say nothing about x. Luckily, that is the case.
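
One possible way to satisfy the borrow checker in this situation, sketched under the assumption that y may be declared in the enclosing scope (the function name last_value and the values 1 and 7 are made up for illustration): once y lives at least as long as the final use of r, the bound 'y: 'r holds all the way to that use. The sketch returns the formatted string instead of printing it, so the result can be inspected:

```rust
#[derive(Debug)]
struct Wrapper<'a>(&'a i32);

// Hypothetical helper: like Example 4, but y is declared in the outer
// scope so it is still live at the final use of r.
fn last_value() -> String {
    let y = 7;
    let mut r;

    {
        let x = 1;
        r = Wrapper(&x);
        assert_eq!(*r.0, 1); // last use of the first value of r
    }                        // x ends; nothing uses the first value after this

    r = Wrapper(&y);         // new value, borrowing the longer-lived y
    format!("{:?}", r)       // use after both scopes
}
```

Note that x can stay in its inner block: the value that borrowed from it is dead before the block ends, so only y had to move.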


Regarding Subtyping

An update to the question asks why the above seems to contradict the subtyping rules. Given a function fn some_fn<'a>(a: &'a i32, b: &'a i32), why can we pass it two references with different lifetimes? It seems that the function declaration requires the exact same lifetime in both cases.

The answer is that subtyping has specific rules for lifetimes:

Whenever references are copied from one location to another, the Rust subtyping rules require that the lifetime of the source reference outlives the lifetime of the target location.

This means Rust never checks whether two lifetimes are equal. It only checks that they outlive each other the right way. In the update's example, 'a is the lifetime of the call to bar. When passing r and r2 to it, we get the bounds 'r: 'a and 'r2: 'a. This makes it very easy, since 'a is only live during the call, where the other lifetimes are also live. After the call, all three lifetimes are dead and therefore the bounds trivially hold.
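
A small sketch of this point, using a hypothetical function sum in place of bar so that the call produces a checkable result (the names sum and call_with_different_lifetimes, and the values 10 and 5, are assumptions for illustration): the two wrappers borrow from variables in different scopes, yet both can be passed where a single 'a is expected, because each borrow only has to outlive the call.

```rust
struct Wrapper<'a>(&'a i32);

// Both parameters share one lifetime parameter, as in `bar` from the question.
fn sum<'a>(p: Wrapper<'a>, q: Wrapper<'a>) -> i32 {
    *p.0 + *q.0
}

// Hypothetical caller: r2 borrows from the outer scope, r from the inner one.
fn call_with_different_lifetimes() -> i32 {
    let z = 10;
    let r2 = Wrapper(&z);
    let total;

    {
        let x = 5;
        let r = Wrapper(&x);
        // 'a only needs to cover the call; both borrows are live here.
        total = sum(r, r2);
    }

    total
}
```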

answered Sep 28 '22 22:09 by Emoun