Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Am I incorrectly implementing IntoIterator for a reference to a LazyList implementation or is this a Rust bug?

In implementing a version of a LazyList (an immutable lazily-computed memoized singly-linked list, much as Haskell lists), I have run into a problem of implementing IntoIterator in that the code does not drop the reference when I think it should. The following code has been simplified so as just to show the problem; thus, is not generic and does not include all of the methods not related to implementing IntoIterator:

use std::cell::UnsafeCell;
use std::mem::replace;
use std::rc::Rc;

// only necessary because Box<FnOnce() -> R> doesn't yet work...
trait Invoke<R = ()> {
    fn invoke(self: Box<Self>) -> R;
}

impl<'a, R, F: 'a + FnOnce() -> R> Invoke<R> for F {
    #[inline(always)]
    fn invoke(self: Box<F>) -> R {
        (*self)()
    }
}

// not thread safe
struct Lazy<'a, T: 'a>(UnsafeCell<LazyState<'a, T>>);

enum LazyState<'a, T: 'a> {
    Unevaluated(Box<Invoke<T> + 'a>),
    EvaluationInProgress,
    Evaluated(T),
}

use self::LazyState::*;

impl<'a, T: 'a> Lazy<'a, T> {
    #[inline]
    fn new<F: 'a + FnOnce() -> T>(func: F) -> Lazy<'a, T> {
        Lazy(UnsafeCell::new(Unevaluated(Box::new(func))))
    }
    #[inline]
    pub fn evaluated(val: T) -> Lazy<'a, T> {
        Lazy(UnsafeCell::new(Evaluated(val)))
    }
    #[inline]
    fn value(&'a self) -> &'a T {
        unsafe {
            match *self.0.get() {
                Evaluated(_) => (), // nothing required; already Evaluated
                EvaluationInProgress => panic!("Lazy::force called recursively!!!"),
                _ => {
                    let ue = replace(&mut *self.0.get(), EvaluationInProgress);
                    if let Unevaluated(thnk) = ue {
                        *self.0.get() = Evaluated(thnk.invoke());
                    } // no other possiblity!
                }
            } // following just gets evaluated, no other state possible
            if let Evaluated(ref v) = *self.0.get() {
                return v;
            } else {
                unreachable!();
            }
        }
    }
}

enum LazyList<'a> {
    Empty,
    Cons(i32, RcLazyListNode<'a>),
}

type RcLazyListNode<'a> = Rc<Lazy<'a, LazyList<'a>>>;

impl<'a> LazyList<'a> {
    fn iter(&self) -> Iter<'a> {
        Iter(self)
    }
}

struct Iter<'a>(*const LazyList<'a>);

impl<'a> Iterator for Iter<'a> {
    type Item = &'a i32;

    fn next(&mut self) -> Option<Self::Item> {
        unsafe {
            if let LazyList::Cons(ref v, ref r) = *self.0 {
                self.0 = r.value();
                Some(v)
            } else {
                None
            }
        }
    }
}

impl<'a> IntoIterator for &'a LazyList<'a> {
    type Item = &'a i32;
    type IntoIter = Iter<'a>;

    fn into_iter(self) -> Self::IntoIter {
        self.iter()
    }
}

fn main() {
    let test2 = LazyList::Cons(2, Rc::new(Lazy::evaluated(LazyList::Empty)));
    let test = LazyList::Cons(1, Rc::new(Lazy::new(move || test2)));
    // let itr = Iter(&test); // works
    // let itr = (&test).iter(); // works
    let itr = IntoIterator::into_iter(&test); // not working
    for v in itr {
        println!("{}", v);
    }
}

The above code fails with:

rustc 1.13.0 (2c6933acc 2016-11-07)
error: `test` does not live long enough
   --> <anon>:103:40
    |
103 |     let itr = IntoIterator::into_iter(&test); // not working
    |                                        ^^^^ does not live long enough
...
107 | }
    | - borrowed value dropped before borrower
    |
    = note: values in a scope are dropped in the opposite order they are created

As noted in the comments in main(), the code is usable except when called as a reference through the IntoIterator trait. This may be a bug in implementing traits for references where the ownership of the returned iterator containing a pointer is not transferred to the same scope as the call to IntoIterator::into_iterbut rather to the 'static lifetime, thus, it is not dropped when expected.

How do I implement this, if possible? I've tried adding a std::marker::PhantomData<> marker field to the Iter struct but it seems that, too, is assigned a 'static lifetime.

like image 549
GordonBGood Avatar asked Feb 05 '23 16:02

GordonBGood


2 Answers

When you implemented IntoIterator, you unified the lifetimes between the reference to the list and the items that the list contains:

impl<'a> IntoIterator for &'a LazyList<'a>

That requires that 'a must be the shorter of the lifetimes. That's not useful in this case. Instead, you need to have two distinct lifetimes:

impl<'l, 'i> IntoIterator for &'l LazyList<'i> {
    type Item = &'i i32;
    type IntoIter = Iter<'i>;

    fn into_iter(self) -> Self::IntoIter {
        self.iter()
    }
}
like image 112
Shepmaster Avatar answered Feb 18 '23 02:02

Shepmaster


For those who find this question by searches for Rust, Lazy, and LazyList, I here post the final generic working code for Lazy and LazyList with both non and thread-safe versions working with current stable Rust version 1.56.1.

The code contains some methods not actually used by the included test code, and especially the unwrap() methods are useless here as we can't consume a type embedded in another type (unless we replace an interior mutable value); a few more tests would need to be devised for the singleton() method, the unwrap() methods, and the tail()method.

Because we can't generally unwrap, the embedded type must be Clone; this costs some performance in the copy operations involved, so when types are large (copy-wise), one may want to wrap them in an Rc for faster reference-counted cloning.

The code is as follows:

pub struct Thunk<'a, R>(Box<dyn FnOnce() -> R + 'a>);
 
impl<'a, R: 'a> Thunk<'a, R> {
    #[inline(always)]
    fn new<F: 'a + FnOnce() -> R>(func: F) -> Thunk<'a, R> {
        Thunk(Box::new(func))
    }
    #[inline(always)]
    fn invoke(self) -> R { self.0() }
}
 
// Lazy is lazily evaluated contained value using the above Thunk implementation...
mod lazy {
    use crate::Thunk;
    use std::cell::UnsafeCell;
    use std::mem::replace;
    use std::ops::Deref;

    // Lazy is lazily evaluated contained value using the above Thunk implementation...
    pub struct Lazy<'a, T: 'a>(UnsafeCell<LazyState<'a, T>>);

    enum LazyState<'a, T: 'a> {
        Unevaluated(Thunk<'a, T>),
        EvaluationInProgress,
        Evaluated(T),
    }

    use self::LazyState::*;

    impl<'a, T: 'a> Lazy<'a, T> {
        #[inline]
        pub fn new<F: 'a + FnOnce() -> T>(func: F) -> Lazy<'a, T> {
            Lazy(UnsafeCell::new(Unevaluated(Thunk::new(func))))
        }
        #[inline]
        pub fn evaluated(val: T) -> Lazy<'a, T> {
            Lazy(UnsafeCell::new(Evaluated(val)))
        }
        #[inline(always)]
        fn force<'b>(&'b self) {
            unsafe {
                match *self.0.get() {
                    Evaluated(_) => return, // nothing required; already Evaluated
                    EvaluationInProgress => panic!("Lazy::force called recursively!!!"),
                    _ => {
                        let ue = replace(&mut *self.0.get(), EvaluationInProgress);
                        if let Unevaluated(thnk) = ue {
                            *self.0.get() = Evaluated(thnk.invoke());
                        } // no other possiblity!
                    }
                }
            }
        }
        #[inline]
        pub fn unwrap<'b>(self) -> T where T: 'b { // consumes the object to produce the value
            self.force(); // evaluatate if not evealutated
            match self.0.into_inner() {
                Evaluated(v) => v,
                _ => unreachable!() // previous code guarantees never not Evaluated
            }
        }
    }

    impl<'a, T: 'a> Deref for Lazy<'a, T> {
        type Target = T;
        #[inline]
        fn deref<'b>(&'b self) -> &'b T {
            self.force(); // evaluatate if not evalutated
            match unsafe { &*self.0.get() } {
                &Evaluated(ref v) => return v,
                _ => unreachable!(),
            }
        }
    }
}

mod lazy_sync {
    use crate::Thunk;
    use std::cell::UnsafeCell;
    use std::mem::replace;
    use std::sync::Mutex;
    use std::sync::atomic::AtomicBool;
    use std::sync::atomic::Ordering::Relaxed;
    use std::ops::Deref;

    pub struct Lazy<'a, T: 'a + Send + Sync>(
        UnsafeCell<LazyState<'a, T>>, AtomicBool, Mutex<()>);

    enum LazyState<'a, T: 'a + Send + Sync> {
        Unevaluated(Thunk<'a, T>),
        EvaluationInProgress,
        Evaluated(T),
    }

    use self::LazyState::*;

    unsafe impl<'a, T: 'a + Send + Sync> Send for Lazy<'a, T> {}
    unsafe impl<'a, T: 'a + Send + Sync> Sync for Lazy<'a, T> {}

    impl<'a, T: 'a + Send + Sync> Lazy<'a, T> {
        #[inline]
        pub fn new<F: 'a + FnOnce() -> T>(func: F) -> Lazy<'a, T> {
            Lazy(UnsafeCell::new(Unevaluated(Thunk::new(func))),
               AtomicBool::new(false), Mutex::new(()))
        }
        #[inline]
        pub fn evaluated(val: T) -> Lazy<'a, T> {
            Lazy(UnsafeCell::new(Evaluated(val)),
               AtomicBool::new(true), Mutex::new(()))
        }
        #[inline(always)]
        fn force<'b>(&'b self) {
            unsafe {
            if !self.1.load(Relaxed) {
              let _ = self.2.lock();
              // if we don't get the false below, means
              // another thread already handled the thunk,
              // including setting to true, still processing when checked
              if !self.1.load(Relaxed) {
                    match *self.0.get() {
                        Evaluated(_) => return, // nothing required; already Evaluated
                        EvaluationInProgress => unreachable!(), // because lock race recursive evals...
                        _ => {
                            if let Unevaluated(thnk) = replace(&mut *self.0.get(), EvaluationInProgress) {
                                *self.0.get() = Evaluated(thnk.invoke());
                            } // no other possiblity!
                        }
                    }
                  self.1.store(true, Relaxed);
                }
            }
          }
        }

        #[inline]
        pub fn unwrap<'b>(self) -> T where T: 'b { // consumes the object to produce the value
            self.force(); // evaluatate if not evealutated
            match self.0.into_inner() {
                Evaluated(v) => v,
                _ => unreachable!() // previous code guarantees never not Evaluated
            }
        }
    }

    impl<'a, T: 'a + Send + Sync> Deref for Lazy<'a, T> {
        type Target = T;
        #[inline]
        fn deref<'b>(&'b self) -> &'b T {
          self.force(); // evaluatate if not evalutated
              match unsafe { &*self.0.get() } {
                  &Evaluated(ref v) => return v,
                  _ => unreachable!(),
              }
        }
    }
}

// LazyList is an immutable lazily-evaluated persistent (memoized) singly-linked list
// similar to lists in Haskell, although here only tails are lazy...
//   depends on the contained type being Clone so that the LazyList can be
//   extracted from the reference-counted Rc heap objects in which embedded.
mod lazylist {
    use crate::lazy::Lazy;
    use std::rc::Rc;
    use std::mem::{replace, swap};

    #[derive(Clone)]
    pub enum LazyList<'a, T: 'a + Clone> {
        Empty,
        Cons(T, RcLazyListNode<'a, T>),
    }

    pub use self::LazyList::Empty;
    use self::LazyList::Cons;

    type RcLazyListNode<'a, T> = Rc<Lazy<'a, LazyList<'a, T>>>;

//  impl<'a, T:'a> !Sync for LazyList<'a, T> {}

    impl<'a, T: 'a + Clone> LazyList<'a, T> {
        #[inline]
        pub fn singleton(v: T) -> LazyList<'a, T> {
            Cons(v, Rc::new(Lazy::evaluated(Empty)))
        }
        #[inline]
        pub fn cons<F>(v: T, cntf: F) -> LazyList<'a, T>
            where F: 'a + FnOnce() -> LazyList<'a, T>
        {
            Cons(v, Rc::new(Lazy::new(cntf)))
        }
        #[inline]
        pub fn head<'b>(&'b self) -> &'b T {
            if let Cons(ref hd, _) = *self {
                return hd;
            }
            panic!("LazyList::head called on an Empty LazyList!!!")
        }
        #[inline]
        pub fn tail<'b>(&'b self) -> &'b Lazy<'a, LazyList<'a, T>> {
            if let Cons(_, ref rlln) = *self {
                return &*rlln;
            }
            panic!("LazyList::tail called on an Empty LazyList!!!")
        }
        #[inline]
        pub fn unwrap(self) -> (T, RcLazyListNode<'a, T>) {
            // consumes the object
            if let Cons(hd, rlln) = self {
                return (hd, rlln);
            }
            panic!("LazyList::unwrap called on an Empty LazyList!!!")
        }
        #[inline]
        fn iter(&self) -> Iter<'a, T> {
            Iter(self)
        }
        pub fn from_iter<I: IntoIterator<Item = T> + 'a>(itrbl: I) -> LazyList<'a, T> {
            let itr = itrbl.into_iter();
            #[inline(always)]
            fn next_iter<'b, R, Itr>(mut iter: Itr) -> LazyList<'b, R>
                where R: 'b + Clone,
                      Itr: 'b + Iterator<Item = R>
            {
                match iter.next() {
                    Some(val) => LazyList::cons(val, move || next_iter(iter)),
                    None => Empty,
                }
            }
            next_iter(itr)
        }

    }

    impl<'a, T: 'a + Clone> Iterator for LazyList<'a, T> {
        type Item = T;

        fn next(&mut self) -> Option<Self::Item> {
            match replace(self, Empty) {
                Cons(hd, rlln) => {
                    let mut newll = (*rlln).clone();
                    swap(self, &mut newll); // self now contains tail, newll contains the Empty
                    Some(hd)
                }
                _ => None,
            }
        }
    }

    pub struct Iter<'a, T: 'a + Clone>(*const LazyList<'a, T>);

    impl<'a, T: 'a + Clone> Iterator for Iter<'a, T> {
        type Item = &'a T;

        fn next(&mut self) -> Option<Self::Item> {
            unsafe {
                if let LazyList::Cons(ref v, ref r) = *self.0 {
                    self.0 = &***r;
                    Some(v)
                } else {
                    None
                }
            }
        }
    }

    impl<'i, 'l, T: 'i + Clone> IntoIterator for &'l LazyList<'i, T> {
        type Item = &'i T;
        type IntoIter = Iter<'i, T>;

        fn into_iter(self) -> Self::IntoIter {
            self.iter()
        }
    }

/*
    impl<'a, T: 'a + Clone> FromIterator<T> for LazyList<'a, T> {
        fn from_iter<I: IntoIterator<Item = T> + 'a>(itrbl: I) -> LazyList<'a, T> {
            let itr = itrbl.into_iter();
            #[inline(always)]
            fn next_iter<'b, R, Itr>(mut iter: Itr) -> LazyList<'b, R>
                where R: 'b + Clone,
                      Itr: 'b + Iterator<Item = R>
            {
                match iter.next() {
                    Some(val) => LazyList::cons(val, move || next_iter(iter)),
                    None => Empty,
                }
            }
            next_iter(itr)
        }
    }
*/

}

mod lazylist_sync {
    use crate::lazy_sync::Lazy;
    use std::sync::Arc as Rc;
    use std::mem::{replace, swap};

    #[derive(Clone)]
    pub enum LazyList<'a, T: 'a + Send + Sync + Clone> {
        Empty,
        Cons(T, RcLazyListNode<'a, T>),
    }

    pub use self::LazyList::Empty;
    use self::LazyList::Cons;

    type RcLazyListNode<'a, T> = Rc<Lazy<'a, LazyList<'a, T>>>;

    unsafe impl<'a, T: 'a + Send + Sync + Clone> Send for LazyList<'a, T> {}
    unsafe impl<'a, T: 'a + Send + Sync + Clone> Sync for LazyList<'a, T> {}

    impl<'a, T: 'a + Send + Sync + Clone> LazyList<'a, T> {
        #[inline]
        pub fn singleton(v: T) -> LazyList<'a, T> {
            Cons(v, Rc::new(Lazy::evaluated(Empty)))
        }
        #[inline]
        pub fn cons<F>(v: T, cntf: F) -> LazyList<'a, T>
            where F: 'a + FnOnce() -> LazyList<'a, T>
        {
            Cons(v, Rc::new(Lazy::new(cntf)))
        }
        #[inline]
        pub fn head<'b>(&'b self) -> &'b T {
            if let Cons(ref hd, _) = *self {
                return hd;
            }
            panic!("LazyList::head called on an Empty LazyList!!!")
        }
        #[inline]
        pub fn tail<'b>(&'b self) -> &'b Lazy<'a, LazyList<'a, T>> {
            if let Cons(_, ref rlln) = *self {
                return &*rlln;
            }
            panic!("LazyList::tail called on an Empty LazyList!!!")
        }
        #[inline]
        pub fn unwrap(self) -> (T, RcLazyListNode<'a, T>) {
            // consumes the object
            if let Cons(hd, rlln) = self {
                return (hd, rlln);
            }
            panic!("LazyList::unwrap called on an Empty LazyList!!!")
        }
        #[inline]
        fn iter(&self) -> Iter<'a, T> {
            Iter(self)
        }
        pub fn from_iter<I: IntoIterator<Item = T> + 'a>(itrbl: I) -> LazyList<'a, T> {
            let itr = itrbl.into_iter();
            #[inline(always)]
            fn next_iter<'b, R: 'b + Send + Sync, Itr>(mut iter: Itr) -> LazyList<'b, R>
                where R: 'b + Clone,
                      Itr: 'b + Iterator<Item = R>
            {
                match iter.next() {
                    Some(val) => LazyList::cons(val, move || next_iter(iter)),
                    None => Empty,
                }
            }
            next_iter(itr)
        }
    }

    impl<'a, T: 'a + Send + Sync + Clone> Iterator for LazyList<'a, T> {
        type Item = T;

        fn next(&mut self) -> Option<Self::Item> {
            match replace(self, Empty) {
                Cons(hd, rlln) => {
                    let mut newll = (*rlln).clone();
                    swap(self, &mut newll); // self now contains tail, newll contains the Empty
                    Some(hd)
                }
                _ => None,
            }
        }
    }

    pub struct Iter<'a, T: 'a + Send + Sync + Clone>(*const LazyList<'a, T>);

    impl<'a, T: 'a + Send + Sync + Clone> Iterator for Iter<'a, T> {
        type Item = &'a T;

        fn next(&mut self) -> Option<Self::Item> {
            unsafe {
                if let LazyList::Cons(ref v, ref r) = *self.0 {
                    self.0 = &***r;
                    Some(v)
                } else {
                    None
                }
            }
        }
    }

    impl<'i, 'l, T: 'i + Send + Sync + Clone> IntoIterator for &'l LazyList<'i, T> {
        type Item = &'i T;
        type IntoIter = Iter<'i, T>;

        fn into_iter(self) -> Self::IntoIter {
            self.iter()
        }
    }

/*
    impl<'a, T: 'a + Send + Sync + Clone> FromIterator<T> for LazyList<'a, T> {
        fn from_iter<I: IntoIterator<Item = T> + 'a>(itrbl: I) -> LazyList<'a, T> {
            let itr = itrbl.into_iter();
            #[inline(always)]
            fn next_iter<'b, R: 'b + Send + Sync, Itr>(mut iter: Itr) -> LazyList<'b, R>
                where R: 'b + Clone,
                      Itr: 'b + Iterator<Item = R>
            {
                match iter.next() {
                    Some(val) => LazyList::cons(val, move || next_iter(iter)),
                    None => Empty,
                }
            }
            next_iter(itr)
        }
    }
*/

}

use self::lazylist::LazyList;
//use self::lazylist_sync::LazyList; // for slower thread-safe version

fn main() {
    fn fib<'a>() -> LazyList<'a, u64> {
        fn fibi<'b>(f: u64, s: u64) -> LazyList<'b, u64> {
            LazyList::cons(f, move || { let n = &f + &s; fibi(s, n) })
        }
        fibi(0, 1)
    }
    let test1 = fib();
    for v in test1.take(20) {
        print!("{} ", v);
    }
    println!("");
    let test2 = LazyList::from_iter(0..);
    for i in (&test2).into_iter().take(15) {
        print!("{} ", i)
    } // and from_iter() works
    println!("");
}

with the output as follows:

0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Note that although this shows that functional programming style with LazyList is possible in Rust, it doesn't mean that it should be the preferred style for all use cases, especially if high performance is desired. For instance, if the above fib() function were written to output an iterator directly rather than a LazyList, then it would only take a very few CPU clock cycles per iteration (other than if infinite precision BigUint were used, which are slower) rather than the hundreds of cycles needed per iteration for the LazyList (and many more for the "sync" version).

In general, more imperative implementations, perhaps using Vec<T> if memoization is required, are much more performant than this due to the high overhead of reference counting, the many small allocations/deallocations and the clone/copying required for functional style programming.

like image 41
GordonBGood Avatar answered Feb 18 '23 03:02

GordonBGood