tl;dr What is the best "Rust way" to create some byte storage, in this case a Vec<u8>, store that Vec<u8> in a struct field that can be accessed with a key value (like a BTreeMap<usize, &Vec<u8>>), and later read those Vec<u8> from some other structs?
Can this be extrapolated to a generally good Rust design for similar structs that act as storage and cache for blobs of bytes (Vec<u8>, [u8; 16384], etc.) accessible with a key (a usize offset, a u32 index, a String file path, etc.)?
I'm trying to create a byte storage struct and impl functions that:
- store bytes read from a file in a Vec<u8> of capacity 16384
- let other structs analyze the various Vec<u8>; those structs may need to store their own references to those "blocks"
Unfortunately, for each implementation attempt, I run into difficult problems of borrowing, lifetime elision, mutability, copying, or other problems.
I created a struct BlockReader that:
- creates a Vec<u8> (Vec<u8>::with_capacity(16384)) typed as Block
- reads from a file (using File::seek and File::take::read_to_end) and stores 16384 u8 into a Vec<u8>
- stores a reference to the Vec<u8> within a BTreeMap typed as Blocks
(playground code)
use std::io::Seek;
use std::io::SeekFrom;
use std::io::Read;
use std::fs::File;
use std::collections::BTreeMap;

type Block = Vec<u8>;
type Blocks<'a> = BTreeMap<usize, &'a Block>;

pub struct BlockReader<'a> {
    blocks: Blocks<'a>,
    file: File,
}

impl<'a> BlockReader<'a> {
    /// read a "block" of 16384 `u8` at file offset
    /// `offset` which is multiple of 16384
    /// if the "block" at the `offset` is cached in
    /// `self.blocks` then return a reference to that
    /// XXX: assume `self.file` is already `open`ed file
    ///      handle
    fn readblock(&mut self, offset: usize) -> Result<&Block, std::io::Error> {
        // the data at this offset is the "cache"
        // return reference to that
        if self.blocks.contains_key(&offset) {
            return Ok(&self.blocks[&offset]);
        }
        // have not read data at this offset so read
        // the "block" of data from the file, store it,
        // return a reference
        let mut buffer = Block::with_capacity(16384);
        self.file.seek(SeekFrom::Start(offset as u64))?;
        self.file.read_to_end(&mut buffer);
        self.blocks.insert(offset, & buffer);
        Ok(&self.blocks[&offset])
    }
}
There have been many problems with each implementation. For example, two calls to BlockReader.readblock by a struct BlockAnalyzer1 have caused endless difficulties:
pub struct BlockAnalyzer1<'b> {
    pub blockreader: BlockReader<'b>,
}

impl<'b> BlockAnalyzer1<'b> {
    /// contrived example function
    pub fn doStuff(&mut self) -> Result<bool, std::io::Error> {
        let mut b: &Block;
        match self.blockreader.readblock(3 * 16384) {
            Ok(val) => {
                b = val;
            },
            Err(err) => {
                return Err(err);
            }
        }
        match self.blockreader.readblock(5 * 16384) {
            Ok(val) => {
                b = val;
            },
            Err(err) => {
                return Err(err);
            }
        }
        Ok(true)
    }
}
results in
error[E0597]: `buffer` does not live long enough
  --> src/lib.rs:34:36
   |
15 | impl<'a> BlockReader<'a> {
   |      -- lifetime `'a` defined here
...
34 |         self.blocks.insert(offset, & buffer);
   |         ---------------------------^^^^^^^^-
   |         |                          |
   |         |                          borrowed value does not live long enough
   |         argument requires that `buffer` is borrowed for `'a`
35 |         Ok(&self.blocks[&offset])
36 |     }
   |     - `buffer` dropped here while still borrowed
However, I ran into many other errors for different permutations of this design. Another error I ran into, for example:
error[E0499]: cannot borrow `self.blockreader` as mutable more than once at a time
   --> src/main.rs:543:23
    |
463 | impl<'a> BlockUser1<'a> {
    |      ----------- lifetime `'a` defined here
...
505 |                 match self.blockreader.readblock(3 * 16384) {
    |                       ---------------------------------------
    |                       |
    |                       first mutable borrow occurs here
    |                       argument requires that `self.blockreader` is borrowed for `'a`
...
543 |                 match self.blockreader.readblock(5 * 16384) {
    |                       ^^^^^^^^^^^^^^^^ second mutable borrow occurs here
In BlockReader, I've tried permutations of "Block" storage using Vec<u8>, &Vec<u8>, Box<Vec<u8>>, Box<&Vec<u8>>, &Box<&Vec<u8>>, &Pin<&Box<&Vec<u8>>>, etc. However, each implementation permutation runs into various confounding problems with borrowing, lifetimes, and mutability.
Again, I'm not looking for the specific fix. I'm looking for a generally good Rust-oriented design approach to this general problem: store a blob of bytes managed by some struct, have other structs get references (or pointers, etc.) to a blob of bytes, and read that blob of bytes in loops (while possibly storing new blobs of bytes).
How would a rust expert approach this problem?
How should I store the Vec<u8> (Block) in BlockReader.blocks, and also allow other structs to store their own references (or pointers, or references to pointers, or pinned Box pointers, etc.) to a Block?
Should the other structs copy or clone a Box<Block> or a Pin<Box<Block>> or something else?
Would a different storage type, such as a fixed-size array (type Block = [u8; 16384];), be easier to pass references to?
Should other structs like BlockUser1 be given &Block, or Box<Block>, or &Pin<&Box<&Block>>, or something else?
Again, each Vec<u8> (Block) is written once (during BlockReader.readblock) and may be read many times by other structs, first by calling BlockReader.readblock and later through their own saved reference/pointer/etc. to that Block (ideally; maybe that's not ideal?).
You can put the Vec<u8> behind an Rc<RefCell<...>>, or simply an Rc<...> if the blocks are immutable. If you need thread-safe access, you'll need to use an Arc<Mutex<...>> or Arc<RwLock<...>> instead.
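Since the question says each block is written once and only read afterwards, the plain Rc option can be sketched as below. This is a minimal sketch under assumptions, not the asker's code: the names BlockCache, read_block, BLOCK_SIZE and demo_rc are made up for illustration, and the cache is generic over any Read + Seek source so that an in-memory std::io::Cursor can stand in for the File.

```rust
use std::collections::BTreeMap;
use std::io::{self, Cursor, Read, Seek, SeekFrom};
use std::rc::Rc;

// Assumed block size, taken from the 16384 used in the question.
const BLOCK_SIZE: u64 = 16384;

type Block = Vec<u8>;
type Blocks = BTreeMap<u64, Rc<Block>>;

/// Hypothetical cache; `R` is any seekable reader (a `File`, a `Cursor`, ...).
pub struct BlockCache<R: Read + Seek> {
    blocks: Blocks,
    source: R,
}

impl<R: Read + Seek> BlockCache<R> {
    pub fn new(source: R) -> Self {
        BlockCache { blocks: Blocks::new(), source }
    }

    /// Return a shared handle to the block starting at `offset`.
    /// The bytes are read from `source` only on the first request.
    pub fn read_block(&mut self, offset: u64) -> io::Result<Rc<Block>> {
        if let Some(block) = self.blocks.get(&offset) {
            // Cloning an `Rc` copies a pointer and bumps a counter;
            // the 16384 bytes themselves are not copied.
            return Ok(Rc::clone(block));
        }
        let mut buffer = Block::with_capacity(BLOCK_SIZE as usize);
        self.source.seek(SeekFrom::Start(offset))?;
        // `take` limits the read to at most one block.
        self.source.by_ref().take(BLOCK_SIZE).read_to_end(&mut buffer)?;
        let block = Rc::new(buffer);
        self.blocks.insert(offset, Rc::clone(&block));
        Ok(block)
    }
}

/// Small usage demo: two requests for the same offset share one allocation.
pub fn demo_rc() -> io::Result<bool> {
    let mut cache = BlockCache::new(Cursor::new(vec![42u8; 40_000]));
    let first = cache.read_block(0)?;
    let again = cache.read_block(0)?;
    Ok(Rc::ptr_eq(&first, &again))
}
```

Because read_block returns an owned Rc<Block> rather than a reference tied to &mut self, a caller can hold the result of one call while making the next, and another struct can keep its own Rc<Block> field for as long as it likes.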
Here's a converted version of your code. (There were a few typos and bits that needed changing to get it to compile; you should really fix those in your example and give us something that nearly compiles.) You can also see this in the playground.
use std::io::Seek;
use std::io::SeekFrom;
use std::io::Read;
use std::fs::File;
use std::cell::RefCell;
use std::rc::Rc;
use std::collections::BTreeMap;

type Block = Vec<u8>;
type Blocks = BTreeMap<usize, Rc<RefCell<Block>>>;

pub struct BlockReader {
    blocks: Blocks,
    file: File,
}

impl BlockReader {
    /// read a "block" of 16384 `u8` at file offset
    /// `offset` which is multiple of 16384
    /// if the "block" at the `offset` is cached in
    /// `self.blocks` then return a shared handle to that
    /// XXX: assume `self.file` is already `open`ed file
    ///      handle
    fn readblock(&mut self, offset: usize) -> Result<Rc<RefCell<Block>>, std::io::Error> {
        // the data at this offset is in the "cache"
        // return a cloned handle to that (the bytes are not copied)
        if self.blocks.contains_key(&offset) {
            return Ok(self.blocks[&offset].clone());
        }
        // have not read data at this offset so read
        // the "block" of data from the file, store it,
        // return a handle
        let mut buffer = Block::with_capacity(16384);
        self.file.seek(SeekFrom::Start(offset as u64))?;
        // `take` caps the read at one block; `?` propagates read errors
        self.file.by_ref().take(16384).read_to_end(&mut buffer)?;
        self.blocks.insert(offset, Rc::new(RefCell::new(buffer)));
        Ok(self.blocks[&offset].clone())
    }
}
pub struct BlockAnalyzer1 {
    pub blockreader: BlockReader,
}

impl BlockAnalyzer1 {
    /// contrived example function
    pub fn doStuff(&mut self) -> Result<bool, std::io::Error> {
        let mut b: Rc<RefCell<Block>>;
        match self.blockreader.readblock(3 * 16384) {
            Ok(val) => {
                b = val;
            },
            Err(err) => {
                return Err(err);
            }
        }
        match self.blockreader.readblock(5 * 16384) {
            Ok(val) => {
                b = val;
            },
            Err(err) => {
                return Err(err);
            }
        }
        Ok(true)
    }
}
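For the thread-safe variant mentioned earlier, one possible shape is sketched below. It is an assumption-laden illustration, not part of the converted code: SharedBlocks, get_or_insert_with and demo_arc are hypothetical names, the file read is replaced by a caller-supplied closure, and only the map sits behind an RwLock because the blocks themselves are never modified after insertion.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};
use std::thread;

type Block = Vec<u8>;

/// Hypothetical thread-safe cache. The map is locked; each block is an
/// immutable `Arc<Block>`, so readers need no lock once they hold a handle.
#[derive(Default)]
pub struct SharedBlocks {
    blocks: RwLock<BTreeMap<usize, Arc<Block>>>,
}

impl SharedBlocks {
    /// Return the cached block for `offset`, or build it with `load`
    /// (which stands in for reading the block from a file).
    pub fn get_or_insert_with<F>(&self, offset: usize, load: F) -> Arc<Block>
    where
        F: FnOnce() -> Block,
    {
        // Fast path: shared read lock, released at the end of this statement.
        let cached = self.blocks.read().unwrap().get(&offset).cloned();
        if let Some(block) = cached {
            return block;
        }
        // Slow path: exclusive lock. `entry` re-checks the key, so if another
        // thread inserted it in the meantime, `load` is not called.
        let mut map = self.blocks.write().unwrap();
        map.entry(offset)
            .or_insert_with(|| Arc::new(load()))
            .clone()
    }
}

/// Small usage demo: four threads request the same block and sum its length.
pub fn demo_arc() -> usize {
    let cache = Arc::new(SharedBlocks::default());
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || cache.get_or_insert_with(0, || vec![0u8; 16384]).len())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

One trade-off in this sketch is that load runs while the write lock is held, which keeps the logic simple but blocks other threads during a slow read. Whether that is acceptable depends on the workload.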