Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I have a property that takes a mutable reference that will outlive itself?

Tags:

rust

Basically I have this struct (unnecessary details elided):

struct Environment<'a> {
  parent: Option<&'a mut Environment<'a>>
}

This struct actually is not what I intended, because I wanted the parent reference to outlive the struct holding the reference.

I can imagine that this is possible with higher-ranked lifetime quantification, for example:

struct Environment {
  parent: <'a> Option<&'a mut Environment>
}

But obviously the code above don't work (at least prior to rustc 1.44).

The reason why I need this is because I'm trying to implement type checker that can process the following (example) code:

let x = 1  // parent environment
let f(y) = {
  // current environment
  x + y
}

So when I do a symbol lookup inside f, say x, I will first do a lookup at the current environment, if not found a recursive lookup will be done in parent environment. After typechecking is done inside f, the environment of f will be thrown away, but the parent environment should still retain so that the next statement can be type checked.

Hopefully this is clear enough to demonstrate why I needed the parent reference to outlive the struct holding it.

To recap the question: how to declare a property inside a struct that can hold a mutable reference of its own type that can outlive itself? If this is not possible what are the alternatives that I can look at?

like image 289
Wong Jia Hau Avatar asked Oct 26 '22 16:10

Wong Jia Hau


1 Answers

The declaration

struct Environment<'a> {
  parent: Option<&'a mut Environment<'a>>
}

does not mean that the parent cannot outlive the struct. In fact, lifetime parameters on structs will always end up referring to lifetimes that are longer than the struct itself's existence.

The problem you actually have is similar, but subtler: you wrote &'a mut Environment<'a>, and in this type

  • Environment<'a> means that the Environment's internal mutable reference is valid for 'a. This implies that the Environment must have been created while within the lifetime 'a.
  • &'a mut means that the reference is valid for exactly 'a which means that its referent must have been created before 'a.

These two requirements almost contradict each other, so they are practically impossible to satisfy (unless 'a: 'static). The general fix for this is to avoid over-constraining the lifetimes, by using separate lifetime parameters:

struct Environment<'r, 'e> {
  parent: Option<&'r mut Environment<'e, ???>>
}

However, this won't work in your case, because there's a deeper problem, hinted at by the <'e, ???>: by having a chain of &mut references, we're allowing the holder of those references to modify any part of any level of that chain, and in particular to replace those references with different references. But in order to do that, they would need to replace it with something that has the same lifetime — and the shape of this structure is a chain of nested longer-as-you-go-farther lifetimes, so for example, taking two Environments and mutating them to swap their positions is possible by the shape of the structure, but would violate the lifetimes.

Something that is possible, but not as flexible as you might want, is to use immutable references inside the structure. To demonstrate this, here's a simple scope checker:

use std::collections::HashSet;
enum Tree {
    Definition(String),
    Use(String),
    Block(Vec<Tree>),
}

struct Environment<'r> {
    parent: Option<&'r Environment<'r>>,
    symbols: HashSet<String>,
}

impl<'r> Environment<'r> {
    fn contains(&self, x: &str) -> bool {
        self.symbols.contains(x) || self.parent.filter(|p| p.contains(x)).is_some()
    }
}

fn empty() -> Environment<'static> {
    Environment {
        parent: None,
        symbols: HashSet::new(),
    }
}

fn check<'r>(env: &'r mut Environment<'_>, tree: &Tree) -> Result<(), String> {
    use Tree::*;
    match tree {
        Definition(symbol) => {
            env.symbols.insert(symbol.clone());
            Ok(())
        }
        Use(symbol) => {
            if !env.contains(symbol) {
                return Err(symbol.clone());
            }
            Ok(())
        }
        Block(statements) => {
            let mut block_env = Environment {
                parent: Some(env),
                symbols: HashSet::new(),
            };
            for statement in statements.iter() {
                check(&mut block_env, statement)?;
            }
            Ok(())
        }
    }
}

#[test]
fn tests() {
    use Tree::*;
    assert_eq!(
        check(
            &mut empty(),
            &Block(vec![Definition("x".to_owned()), Use("x".to_owned())])
        ),
        Ok(())
    );
    assert_eq!(
        check(
            &mut empty(),
            &Block(vec![Definition("x".to_owned()), Use("y".to_owned())])
        ),
        Err("y".to_owned())
    );
    assert_eq!(
        check(
            &mut empty(),
            &Block(vec![
                Definition("x".to_owned()),
                Block(vec![Use("x".to_owned())])
            ])
        ),
        Ok(())
    );
}

Note that the declaration

struct Environment<'r> {
    parent: Option<&'r Environment<'r>>,
    symbols: HashSet<String>,
}

contains Option<&'r Environment<'r>>, which is exactly what I told you not to do earlier. The reason that Environment<'r> doesn't over-constrain the lifetimes is that the type of the referent of an immutable reference is covariant — this means that if we want an Environment<'r> we can accept an Environment<'e> as long as 'e outlives 'r. Mutable references, on the other hand, require invariance: they must exactly match. (This is because with an immutable reference, data can only flow out of it, but given a mutable reference, data can flow either in or out, so it would be unsound if there was a lifetime mismatch in either direction.)

The caveat with what I've written above is that it is not possible to mutate environments farther up the chain, which you might want if you're performing type inference (so that uses can determine the types of declarations). I'm not entirely certain, but I think that in order to do that, you'll need to resort to interior mutability to let you mutate something even without a mutable reference.

Here's the above example modified to use the interior mutability tool std::cell::RefCell. Note that this code does not actually use the additional flexibility — but it's there; you can modify the symbols of any parent by using the explicit runtime-checked .symbols.borrow_mut() operation.

use std::cell::RefCell;
use std::collections::HashSet;

enum Tree {
    Definition(String),
    Use(String),
    Block(Vec<Tree>),
}

struct Environment<'r> {
    parent: Option<&'r Environment<'r>>,
    symbols: RefCell<HashSet<String>>,
}

impl<'r> Environment<'r> {
    fn contains(&self, x: &str) -> bool {
        self.symbols.borrow().contains(x) || self.parent.filter(|p| p.contains(x)).is_some()
    }
}

fn empty() -> Environment<'static> {
    Environment {
        parent: None,
        symbols: RefCell::new(HashSet::new()),
    }
}

fn check<'r>(env: &'r Environment<'_>, tree: &Tree) -> Result<(), String> {
    use Tree::*;
    match tree {
        Definition(symbol) => {
            env.symbols.borrow_mut().insert(symbol.clone());
            Ok(())
        }
        Use(symbol) => {
            if !env.contains(symbol) {
                return Err(symbol.clone());
            }
            Ok(())
        }
        Block(statements) => {
            let block_env = Environment {
                parent: Some(env),
                symbols: RefCell::new(HashSet::new()),
            };
            for statement in statements.iter() {
                check(&block_env, statement)?;
            }
            Ok(())
        }
    }
}

#[test]
fn tests() {
    use Tree::*;
    assert_eq!(
        check(
            &mut empty(),
            &Block(vec![Definition("x".to_owned()), Use("x".to_owned())])
        ),
        Ok(())
    );
    assert_eq!(
        check(
            &mut empty(),
            &Block(vec![Definition("x".to_owned()), Use("y".to_owned())])
        ),
        Err("y".to_owned())
    );
    assert_eq!(
        check(
            &mut empty(),
            &Block(vec![
                Definition("x".to_owned()),
                Block(vec![Use("x".to_owned())])
            ])
        ),
        Ok(())
    );
}
like image 88
Kevin Reid Avatar answered Dec 20 '22 01:12

Kevin Reid