
Encapsulating sequentially initialized state with self-references in Rust struct

Tags:

rust

I'm trying to define a struct that can act as an iterator for a Vec that is held within a RefCell:

use std::slice::Iter;
use std::cell::Ref;
use std::cell::RefCell;

struct HoldsVecInRefCell {
    vec_in_refcell: RefCell<Vec<i32>>,
}

// TODO: struct HoldsVecInRefCellIter implementing Iterator ...

impl HoldsVecInRefCell {
    fn new() -> HoldsVecInRefCell {
        HoldsVecInRefCell { vec_in_refcell: RefCell::new(Vec::new()) }
    }

    fn add_int(&self, i: i32) {
        self.vec_in_refcell.borrow_mut().push(i);
    }

    fn iter(&self) -> HoldsVecInRefCellIter {
        // TODO ...
    }
}

fn main() {
    let holds_vec = HoldsVecInRefCell::new();
    holds_vec.add_int(1);
    holds_vec.add_int(2);
    holds_vec.add_int(3);

    let mut vec_iter = holds_vec.iter();  // Under the hood: run-time borrow check

    for i in vec_iter {
        println!("{}", i);
    }
}

By comparison, vec_iter can be initialized in-line in main() as follows (deliberately verbose):

// Elided: lifetime parameter of Ref
let vec_ref: Ref<Vec<i32>> = holds_vec.vec_in_refcell.borrow();
// Elided: lifetime parameter of Iter
let mut vec_iter: Iter<i32> = vec_ref.iter();

Is there any way to define a struct implementing Iterator that holds both the Ref (to keep the immutable RefCell borrow alive) and the Iter (to maintain iterator state for next(), rather than rolling my own iterator for Vec or whatever other container), when the second is derived from (and holds a reference obtained from) the first?
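
To make the role of the Ref concrete, here is a minimal sketch of the in-line version wrapped in a function. The helper name `sum_while_borrowed` is hypothetical and not part of the code above; it only shows that the RefCell refuses a mutable borrow for as long as the Ref is alive, which is why the Ref has to be kept next to the Iter.

```rust
use std::cell::{Ref, RefCell};
use std::slice::Iter;

// Hypothetical helper: iterates the Vec through a Ref and reports whether a
// mutable borrow was refused while that Ref was still alive.
fn sum_while_borrowed(cell: &RefCell<Vec<i32>>) -> (i32, bool) {
    let vec_ref: Ref<'_, Vec<i32>> = cell.borrow();
    let vec_iter: Iter<'_, i32> = vec_ref.iter();
    // The run-time borrow check: vec_ref is still in scope here.
    let refused = cell.try_borrow_mut().is_err();
    (vec_iter.sum(), refused)
}

fn main() {
    let cell = RefCell::new(vec![1, 2, 3]);
    let (total, refused) = sum_while_borrowed(&cell);
    assert_eq!(total, 6);
    assert!(refused);
    // The Ref was dropped when the helper returned, so mutation works again.
    assert!(cell.try_borrow_mut().is_ok());
}
```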

I've tried several approaches to implementing this, and all run afoul of the borrow checker. If I put both pieces of state as bare struct members, like

struct HoldsVecInRefCellIter<'a> {
    vec_ref: Ref<'a, Vec<i32>>,
    vec_iter: Iter<'a, i32>,
}

then I can't initialize both fields at once with HoldsVecInRefCellIter { ... } syntax (see e.g., Does Rust have syntax for initializing a struct field with an earlier field?). If I try to shunt sequential initialization with a struct like

struct HoldsVecInRefCellIter<'a> {
    vec_ref: Ref<'a, Vec<i32>>,
    vec_iter: Option<Iter<'a, i32>>,
}

// ...

impl HoldsVecInRefCell {
    // ...

    fn iter(&self) -> HoldsVecInRefCellIter {
        let mut new_iter = HoldsVecInRefCellIter { vec_ref: self.vec_in_refcell.borrow(), vec_iter: None };
        new_iter.vec_iter = Some(new_iter.vec_ref.iter());
        new_iter
    }
}

then I incur a mutable self-borrow of the struct that prevents returning it from iter(). This self-borrowing of a struct can also happen if you try to store a reference to one part of a struct in the struct itself (Why can't I store a value and a reference to that value in the same struct?), which would prevent safely moving instances of the struct. By comparison, it seems like a struct like HoldsVecInRefCellIter, if you could complete initialization, would do the correct thing when moved, since all references internally are to data elsewhere that outlives this struct.

There are tricks to avoid creating self-references using Rc (see examples at https://internals.rust-lang.org/t/self-referencing-structs/418/3), but I don't see how these could be applied if you want to store an existing Iterator struct which is implemented to hold a direct reference to the underlying container, not an Rc.

As a Rust newbie coming from C++, this feels like a problem that would come up often ("I have some complex state initialization logic in a block of code, and I want to abstract away that logic and hold the resulting state in a struct for use").

Related Question: Returning iterator of a Vec in a RefCell

Daniel S. asked Sep 12 '16 00:09


1 Answer

We'll have to cheat and lie about lifetimes.

use std::mem;

struct HoldsVecInRefCellIter<'a> {
    vec_ref: Ref<'a, Vec<i32>>,
    vec_iter: Iter<'a, i32>, // 'a is a lie!
}

impl HoldsVecInRefCell {
    fn iter(&self) -> HoldsVecInRefCellIter<'_> {
        let vec_ref = self.vec_in_refcell.borrow();
        // transmute changes only the lifetime parameter on the Iter; the
        // target type is inferred from the vec_iter field of the struct
        let vec_iter = unsafe { mem::transmute(vec_ref.iter()) };
        HoldsVecInRefCellIter { vec_ref, vec_iter }
    }
}

impl<'a> Iterator for HoldsVecInRefCellIter<'a> {
    type Item = i32;

    fn next(&mut self) -> Option<Self::Item> {
        self.vec_iter.next().cloned()
    }
}

This only works because the Iter isn't invalidated by moving the Ref: the Ref points to the Vec, and the Iter points to the Vec's storage, not to the Ref itself.
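
A small safe sketch of that claim: moving a Ref into another function does not change where the Vec's elements live. The helper name `storage_addr_after_move` is made up for this illustration.

```rust
use std::cell::{Ref, RefCell};

// Hypothetical helper: takes the guard by value (a move) and reports where
// the Vec's element storage lives afterwards.
fn storage_addr_after_move(guard: Ref<'_, Vec<i32>>) -> *const i32 {
    // Resolves to Vec::as_ptr through Ref's Deref impl.
    guard.as_ptr()
}

fn main() {
    let cell = RefCell::new(vec![1, 2, 3]);
    let guard = cell.borrow();
    let before = guard.as_ptr();
    // The guard is moved into the helper; the heap buffer stays where it was.
    let after = storage_addr_after_move(guard);
    assert_eq!(before, after);
}
```

A pointer into the buffer, which is what the Iter holds, therefore stays valid across moves of the Ref, as long as the Vec itself is not mutated.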

However, this also enables you to move vec_iter out of the HoldsVecInRefCellIter; if you extract vec_iter and drop vec_ref, then the borrow would be released and the Iter could be invalidated without Rust giving a compiler error ('a is the lifetime of the borrow of the RefCell, not of the Ref). With proper encapsulation, you can keep the struct's contents private and prevent users from performing this unsafe operation.
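
One way the encapsulation could look, as a sketch rather than a definitive implementation: the types live in a module (the name `holder` is an assumption for this example), both fields are private, and the iterator yields `i32` by value so nothing borrowed from the Vec escapes. Declaring `vec_iter` before `vec_ref` is a precaution of this sketch, since fields are dropped in declaration order.

```rust
mod holder {
    use std::cell::{Ref, RefCell};
    use std::mem;
    use std::slice::Iter;

    pub struct HoldsVecInRefCell {
        vec_in_refcell: RefCell<Vec<i32>>,
    }

    // Both fields are private: code outside this module cannot move
    // vec_iter out or drop vec_ref on its own.
    pub struct HoldsVecInRefCellIter<'a> {
        // 'a overstates how long this Iter is valid; it must not outlive vec_ref.
        vec_iter: Iter<'a, i32>,
        // Never read; held only to keep the RefCell borrow alive.
        #[allow(dead_code)]
        vec_ref: Ref<'a, Vec<i32>>,
    }

    impl HoldsVecInRefCell {
        pub fn new() -> Self {
            HoldsVecInRefCell { vec_in_refcell: RefCell::new(Vec::new()) }
        }

        pub fn add_int(&self, i: i32) {
            self.vec_in_refcell.borrow_mut().push(i);
        }

        pub fn iter<'a>(&'a self) -> HoldsVecInRefCellIter<'a> {
            let vec_ref = self.vec_in_refcell.borrow();
            // SAFETY (as argued in the answer): the Iter points into the Vec's
            // storage, which cannot change while vec_ref is held, and the two
            // are stored together behind private fields.
            let vec_iter = unsafe {
                mem::transmute::<Iter<'_, i32>, Iter<'a, i32>>(vec_ref.iter())
            };
            HoldsVecInRefCellIter { vec_iter, vec_ref }
        }
    }

    impl<'a> Iterator for HoldsVecInRefCellIter<'a> {
        type Item = i32;

        fn next(&mut self) -> Option<i32> {
            self.vec_iter.next().copied()
        }
    }
}

fn main() {
    let holds_vec = holder::HoldsVecInRefCell::new();
    holds_vec.add_int(1);
    holds_vec.add_int(2);
    holds_vec.add_int(3);

    let collected: Vec<i32> = holds_vec.iter().collect();
    assert_eq!(collected, vec![1, 2, 3]);

    // The temporary iterator (and its Ref) is gone, so mutation works again.
    holds_vec.add_int(4);
    assert_eq!(holds_vec.iter().count(), 4);
}
```

Calling `add_int` while an iterator is still alive would panic at run time, because the RefCell is still immutably borrowed through `vec_ref`.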

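By the way, we could just as well define the iterator to return references. Be careful with this variant, though: the `&'a i32` items carry the same lie about 'a, so a caller can keep them after the iterator (and its Ref) has been dropped, at which point a borrow_mut() on the RefCell could invalidate them: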

impl<'a> Iterator for HoldsVecInRefCellIter<'a> {
    type Item = &'a i32;

    fn next(&mut self) -> Option<Self::Item> {
        self.vec_iter.next()
    }
}

Francis Gagné answered Oct 20 '22 02:10