I have collections dumped on disk. When requested, these collections should be retrieved (no problem), and an iterator should be built that returns references to the retrieved values.
After the iterator is dropped, I do not need the collection anymore. I want it to be dropped, too.
What I have tried so far:
The Iterator owns the collection. This made the most sense to me, but it is not possible; I am not quite sure why. Some say the signature of the next method in the Iterator trait is the problem. (example)
Reference counting: The retriever returns an Rc<Vec<usize>>. I ran into the same problems as with the owning iterator. (example)
Letting the retriever own the collection and handing out a reference to it. I tried to implement the retriever with interior mutability (RefCell<HashMap>), but I cannot return references into the HashMap with long enough lifetimes.
I see two basic possibilities with this.
The retriever transfers ownership. Then the Iterator would need to own the data. Something along the lines of:
use std::slice::Iter;

fn retrieve(id: usize) -> Vec<usize> {
    // Create data out of the blue (or disk, or memory, or network; I don't care).
    // Move the data out, transferring ownership.
    let data = vec![0, 1, 2, 3];
    data
}

fn consume_iterator<'a, TIterator: Iterator<Item=&'a usize>>(iterator: TIterator) {
    for i in iterator {
        println!("{}", i);
    }
}

fn handler<'a>(id: usize) -> Iter<'a, usize> {
    // handler now owns the vector.
    // I now want to build an owning iterator.
    // This does of course not compile, as the vector will be dropped at the end of this function.
    retrieve(id).iter()
}

fn main() {
    consume_iterator(handler(0))
}
The retriever owns the collection. But then two new problems arise:
use std::cell::{Ref, RefCell};

struct Retriever {
    // Own the data. But I want it to be dropped as soon as the references to it go out of scope.
    data: RefCell<Vec<usize>>
}

impl Retriever {
    fn retrieve<'a>(&'a self, id: usize) -> Ref<'a, Vec<usize>> {
        // Create data out of the blue (or disk, or memory, or network; I don't care).
        // Now the data can be stored internally and a reference to it can be supplied.
        let mut data = self.data.borrow_mut();
        *data = vec![0, 1, 2, 3];
        self.data.borrow()
    }
}

fn consume_iterator<'a, TIterator: Iterator<Item=&'a usize>>(iterator: TIterator) {
    for i in iterator {
        println!("{}", i);
    }
}

fn handler<'a>(ret: &'a Retriever, id: usize) -> IterWrapper<'a> {
    // handler now has a reference to the collection.
    // So just call iter()? Nope. Lifetime issues.
    ret.retrieve(id).iter()
}

fn main() {
    let retriever = Retriever{data: RefCell::new(Vec::new())};
    consume_iterator(handler(&retriever, 0))
}
I feel a bit lost here and am overlooking something obvious.
The Iterator owns the collection. [or joint ownership via reference-counting]
ContainerIterator { data: data, iter: data.iter(), }
No, you cannot have a value and a reference to that value in the same struct.
Letting the retriever own the collection and handing out a reference to it.
No, you cannot return references to items owned by the iterator.
As commenters have said, use IntoIter to transfer ownership of the items to the iterator and then hand them out as the iterated values:
use std::vec::IntoIter;

struct ContainerIterator {
    iter: IntoIter<usize>,
}

impl Iterator for ContainerIterator {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        self.iter.next()
    }
}

fn main() {
    let data = vec![0, 1, 2, 3];
    let cont = ContainerIterator { iter: data.into_iter() };
    for x in cont {
        println!("Hi {}", x)
    }
}
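Applied to the handler from the question, the same idea might look like the sketch below. It assumes the consumer can accept owned usize values instead of &usize; the changed signatures of handler and consume_iterator (including the returned count) are illustrative and not from the original post. Returning impl Iterator also avoids the wrapper struct.

```rust
// Stand-in for loading a collection from disk; the id is ignored here.
fn retrieve(_id: usize) -> Vec<usize> {
    vec![0, 1, 2, 3]
}

// The vector is moved into the returned iterator, so the data lives
// exactly as long as the iterator and is freed when the iterator is dropped.
fn handler(id: usize) -> impl Iterator<Item = usize> {
    retrieve(id).into_iter()
}

// Consumes owned items (assumption: references are not required).
// Returns the number of items seen, purely for illustration.
fn consume_iterator<I: Iterator<Item = usize>>(iterator: I) -> usize {
    let mut count = 0;
    for i in iterator {
        println!("{}", i);
        count += 1;
    }
    count
}

fn main() {
    consume_iterator(handler(0));
}
```

For element types that are expensive to move or clone this may be less attractive, but for plain values such as usize it is probably the simplest option.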
If you must return references... then you need to keep the thing that owns them around for the entire time that all the references might be around.
How can I drop the data when the iterator is out of range?
By not using the value any more:
fn main() {
    {
        let loaded_from_disk = vec![0, 1, 2, 3];
        for i in &loaded_from_disk {
            println!("{}", i)
        }
        // loaded_from_disk goes out of scope and is dropped. Nothing to *do*, per se.
    }
}
How do I tell the borrow-checker that I will own the collection long enough?
By owning the collection long enough. There's no secret handshake that the Rust Illuminati use with the borrow checker. The code only needs to be structured such that the thing that is borrowed doesn't become invalid while the borrow is outstanding. You can't move it (changing the memory address) or drop it (invalidating the memory address).
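One way to structure code like that, if references really are required, is to let a function own the collection and lend an iterator to a closure. The following is only a sketch: with_collection is a hypothetical helper name, and retrieve stands in for the disk load from the question.

```rust
use std::slice::Iter;

// Stand-in for loading a collection from disk; the id is ignored here.
fn retrieve(_id: usize) -> Vec<usize> {
    vec![0, 1, 2, 3]
}

// Hypothetical helper: owns the collection for the duration of the call,
// lends an iterator of references to the closure, and drops the collection
// on return. The result type R cannot borrow from the collection, so the
// borrow checker accepts this.
fn with_collection<R>(id: usize, f: impl FnOnce(Iter<'_, usize>) -> R) -> R {
    let data = retrieve(id);
    let result = f(data.iter());
    result
    // `data` is dropped here, after the closure has finished with it.
}

fn main() {
    let sum = with_collection(0, |iter| iter.sum::<usize>());
    println!("sum = {}", sum);
}
```

The trade-off is inverted control: the caller passes in what to do with the iterator rather than receiving the iterator itself.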
I was finally able to implement a relatively satisfying solution:
Hiding the mutability of iterators inside Cells:
pub trait OwningIterator<'a> {
    type Item;
    fn next(&'a self) -> Option<Self::Item>;
}
A struct now needs a position wrapped in a Cell to allow iteration without mutation.
As an example, here is the implementation of a struct that both owns and can iterate over an Arc<Vec<T>>:
use std::cell::Cell;
use std::sync::Arc;

pub struct ArcIter<T> {
    data: Arc<Vec<T>>,
    pos: Cell<usize>,
}

impl<'a, T: 'a> OwningIterator<'a> for ArcIter<T> {
    type Item = &'a T;

    fn next(&'a self) -> Option<Self::Item> {
        // Advance the position through the Cell; `self` itself is never mutated.
        if self.pos.get() < self.data.len() {
            self.pos.set(self.pos.get() + 1);
            return Some(&self.data[self.pos.get() - 1]);
        }
        None
    }
}
As I was able to hide these kinds of iterators behind interfaces and let the user handle only "real" iterators, I feel this is an acceptable deviation from the standard.
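How such hiding might look is sketched below. The trait and ArcIter are repeated so the example is self-contained; ArcIter::new, the AsIter adapter, and collect_doubled are hypothetical names that do not appear in the original post, and next is written with get instead of indexing.

```rust
use std::cell::Cell;
use std::sync::Arc;

pub trait OwningIterator<'a> {
    type Item;
    fn next(&'a self) -> Option<Self::Item>;
}

pub struct ArcIter<T> {
    data: Arc<Vec<T>>,
    pos: Cell<usize>,
}

impl<T> ArcIter<T> {
    // Hypothetical constructor, added for this example.
    pub fn new(data: Arc<Vec<T>>) -> Self {
        ArcIter { data, pos: Cell::new(0) }
    }
}

impl<'a, T: 'a> OwningIterator<'a> for ArcIter<T> {
    type Item = &'a T;

    fn next(&'a self) -> Option<Self::Item> {
        let pos = self.pos.get();
        let item = self.data.get(pos)?; // None once the end is reached
        self.pos.set(pos + 1);
        Some(item)
    }
}

// Hypothetical adapter: borrows an OwningIterator and exposes it as a
// standard Iterator, so callers can use `for`, `map`, `collect`, and so on.
pub struct AsIter<'a, I> {
    inner: &'a I,
}

impl<'a, I: OwningIterator<'a>> Iterator for AsIter<'a, I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // `self.inner` is a shared reference with the full lifetime 'a.
        OwningIterator::next(self.inner)
    }
}

// The owner lives inside this function and is dropped when it returns.
pub fn collect_doubled(data: Vec<usize>) -> Vec<usize> {
    let owner = ArcIter::new(Arc::new(data));
    let doubled: Vec<usize> = AsIter { inner: &owner }.map(|x| x * 2).collect();
    doubled
}

fn main() {
    println!("{:?}", collect_doubled(vec![0, 1, 2, 3]));
}
```

Note that the owner still has to outlive the adapter, so this does not bundle data and iterator into one movable value; it only keeps the Cell-based trait out of user-facing code.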
Thanks to everyone who contributed with ideas that ultimately helped me to find that solution.