I have a set of objects that need to know each other to cooperate. These objects are stored in a container. I'm trying to get a very simple idea of how to architect my code in Rust.
Let's use an analogy. A Computer contains:
Mmu
Ram
Processor
In Rust:
struct Computer {
    mmu: Mmu,
    ram: Ram,
    cpu: Cpu,
}
For anything to work, the Cpu needs to know about the Mmu it is linked to, and the Mmu needs to know the Ram it is linked to.
I do not want the Cpu to aggregate the Mmu by value. Their lifetimes differ: the Mmu can live its own life by itself. It just happens that I can plug it into the Cpu. However, there is no sense in creating a Cpu without an Mmu attached to it, since it would not be able to do its job. The same relation exists between Mmu and Ram.
Therefore:
Ram can live by itself.
Mmu needs a Ram.
Cpu needs an Mmu.
How can I model that kind of design in Rust, one with a struct whose fields know about each other?
In C++, it would be along the lines of:
struct Ram
{
};

struct Mmu
{
    Ram& ram;
    Mmu(Ram& r) : ram(r) {}
};

struct Cpu
{
    Mmu& mmu;
    Cpu(Mmu& m) : mmu(m) {}
};

struct Computer
{
    Ram ram;
    Mmu mmu;
    Cpu cpu;
    Computer() : ram(), mmu(ram), cpu(mmu) {}
};
Here is how I started translating that in Rust:
struct Ram;

struct Mmu<'a> {
    ram: &'a Ram,
}

struct Cpu<'a> {
    mmu: &'a Mmu<'a>,
}

impl Ram {
    fn new() -> Ram {
        Ram
    }
}

impl<'a> Mmu<'a> {
    fn new(ram: &'a Ram) -> Mmu<'a> {
        Mmu {
            ram: ram
        }
    }
}

impl<'a> Cpu<'a> {
    fn new(mmu: &'a Mmu) -> Cpu<'a> {
        Cpu {
            mmu: mmu,
        }
    }
}

fn main() {
    let ram = Ram::new();
    let mmu = Mmu::new(&ram);
    let cpu = Cpu::new(&mmu);
}
That is fine and all, but now I just can't find a way to create the Computer struct.
I started with:
struct Computer<'a> {
    ram: Ram,
    mmu: Mmu<'a>,
    cpu: Cpu<'a>,
}

impl<'a> Computer<'a> {
    fn new() -> Computer<'a> {
        // Cannot do that, since struct fields are not accessible from the initializer
        Computer {
            ram: Ram::new(),
            mmu: Mmu::new(&ram),
            cpu: Cpu::new(&mmu),
        }

        // Of course cannot do that, since local variables won't live long enough
        let ram = Ram::new();
        let mmu = Mmu::new(&ram);
        let cpu = Cpu::new(&mmu);
        Computer {
            ram: ram,
            mmu: mmu,
            cpu: cpu,
        }
    }
}
Okay, whatever, I won't be able to find a way to make structure fields reference each other. I thought I could come up with something by creating the Ram, Mmu and Cpu on the heap and putting them inside the struct:
struct Computer<'a> {
    ram: Box<Ram>,
    mmu: Box<Mmu<'a>>,
    cpu: Box<Cpu<'a>>,
}

impl<'a> Computer<'a> {
    fn new() -> Computer<'a> {
        let ram = Box::new(Ram::new());
        // V-- ERROR: reference must be valid for the lifetime 'a
        let mmu = Box::new(Mmu::new(&*ram));
        let cpu = Box::new(Cpu::new(&*mmu));
        Computer {
            ram: ram,
            mmu: mmu,
            cpu: cpu,
        }
    }
}
Yeah, that's right: at this point Rust has no way to know that I'm going to transfer ownership of the value created by let ram = Box::new(Ram::new()) to the Computer, where it would live for the lifetime 'a.
I've been trying various more or less hackish ways to get that right, but I just can't come up with a clean solution. The closest I've come is to drop the reference and use an Option, but then all my methods have to check whether the Option is Some or None, which is rather ugly.
I think I'm just on the wrong track here, trying to map what I would do in C++ onto Rust, and that doesn't work. That's why I need help finding out what the idiomatic Rust way of creating this architecture is.
In this answer I will discuss two approaches to solving this problem: one in safe Rust with zero dynamic allocation and very little runtime cost, but which can be constricting, and one with dynamic allocation that relies on invariants upheld by unsafe code.
Safe Rust (Cell<Option<&'a T>>)
use std::cell::Cell;
#[derive(Debug)]
struct Computer<'a> {
    ram: Ram,
    mmu: Mmu<'a>,
    cpu: Cpu<'a>,
}

#[derive(Debug)]
struct Ram;

#[derive(Debug)]
struct Cpu<'a> {
    mmu: Cell<Option<&'a Mmu<'a>>>,
}

#[derive(Debug)]
struct Mmu<'a> {
    ram: Cell<Option<&'a Ram>>,
}

impl<'a> Computer<'a> {
    fn new() -> Computer<'a> {
        Computer {
            ram: Ram,
            cpu: Cpu {
                mmu: Cell::new(None),
            },
            mmu: Mmu {
                ram: Cell::new(None),
            },
        }
    }

    fn freeze(&'a self) {
        self.mmu.ram.set(Some(&self.ram));
        self.cpu.mmu.set(Some(&self.mmu));
    }
}

fn main() {
    let computer = Computer::new();
    computer.freeze();
    println!("{:?}, {:?}, {:?}", computer.ram, computer.mmu, computer.cpu);
}
Contrary to popular belief, self-references are in fact possible in safe Rust, and even better, when you use them Rust will continue to enforce memory safety for you.
The main "hack" needed to get self, recursive, or cyclical references using &'a T is the use of a Cell<Option<&'a T>> to contain the reference. You won't be able to do this without the Cell<Option<T>> wrapper.
The clever bit of this solution is splitting initial creation of the struct from proper initialization. This has the unfortunate downside that it's possible to use this struct incorrectly by creating it and using it before calling freeze, but it can't result in memory unsafety without further usage of unsafe.
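To make the "used before freeze" hazard concrete, here is a minimal sketch that trims the example down to Ram and Mmu. The ram() accessor and the linked_before_and_after function are hypothetical names introduced for illustration; they are not part of the code above. The point is that forgetting freeze shows up as a None, never as a dangling reference.

```rust
use std::cell::Cell;

struct Ram;

struct Mmu<'a> {
    ram: Cell<Option<&'a Ram>>,
}

impl<'a> Mmu<'a> {
    // Hypothetical accessor: exposes the "not linked yet" state as an Option.
    fn ram(&self) -> Option<&'a Ram> {
        self.ram.get()
    }
}

struct Computer<'a> {
    ram: Ram,
    mmu: Mmu<'a>,
}

impl<'a> Computer<'a> {
    fn new() -> Computer<'a> {
        Computer {
            ram: Ram,
            mmu: Mmu { ram: Cell::new(None) },
        }
    }

    fn freeze(&'a self) {
        self.mmu.ram.set(Some(&self.ram));
    }
}

// Reports whether the Mmu is linked to its Ram before and after `freeze`.
fn linked_before_and_after() -> (bool, bool) {
    let computer = Computer::new();
    let before = computer.mmu.ram().is_some();
    computer.freeze();
    let after = computer.mmu.ram().is_some();
    (before, after)
}

fn main() {
    let (before, after) = linked_before_and_after();
    println!("linked before freeze: {}, after freeze: {}", before, after);
}
```

An accessor like this still forces callers to handle the None case, which is the price of the two-step initialization.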
The initial creation of the struct only sets the stage for our later hackery - it creates the Ram, which has no dependencies, and sets the Cpu and Mmu to their unusable state, containing Cell::new(None) instead of the references they need.
Then, we call the freeze method, which deliberately holds a borrow of self with lifetime 'a, that is, for the full lifetime of the struct. After we call this method, the compiler will prevent us from getting mutable references to the Computer or moving the Computer, as either could invalidate the references that we are holding. The freeze method then sets up the Mmu and Cpu appropriately by setting their Cells to contain Some(&self.ram) and Some(&self.mmu) respectively.
After freeze is called our struct is ready to be used, but only immutably.
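As a rough illustration of what "ready to be used, but only immutably" means in practice, the sketch below walks from the Cpu to the Ram through the stored references. The size field, the ram_size helper and the ram_size_seen_by_cpu function are assumptions added for the example, not part of the answer's code.

```rust
use std::cell::Cell;

// `size` is a hypothetical field, added so there is something to read.
struct Ram {
    size: usize,
}

struct Mmu<'a> {
    ram: Cell<Option<&'a Ram>>,
}

struct Cpu<'a> {
    mmu: Cell<Option<&'a Mmu<'a>>>,
}

impl<'a> Cpu<'a> {
    // Hypothetical helper: follows Cpu -> Mmu -> Ram through shared references.
    // Returns None if `freeze` has not been called yet.
    fn ram_size(&self) -> Option<usize> {
        let mmu = self.mmu.get()?;
        let ram = mmu.ram.get()?;
        Some(ram.size)
    }
}

struct Computer<'a> {
    ram: Ram,
    mmu: Mmu<'a>,
    cpu: Cpu<'a>,
}

impl<'a> Computer<'a> {
    fn new(size: usize) -> Computer<'a> {
        Computer {
            ram: Ram { size },
            mmu: Mmu { ram: Cell::new(None) },
            cpu: Cpu { mmu: Cell::new(None) },
        }
    }

    fn freeze(&'a self) {
        self.mmu.ram.set(Some(&self.ram));
        self.cpu.mmu.set(Some(&self.mmu));
    }
}

fn ram_size_seen_by_cpu(size: usize) -> Option<usize> {
    let computer = Computer::new(size);
    computer.freeze();
    // Shared (read-only) access keeps working after `freeze`.
    let seen = computer.cpu.ram_size();
    // Uncommenting the next line is expected to fail to compile, because
    // `freeze` borrowed `computer` for its entire lifetime:
    // let moved = computer;
    seen
}

fn main() {
    println!("{:?}", ram_size_seen_by_cpu(4096));
}
```

If some state does need to change after freeze, it would have to sit behind interior mutability (for example another Cell), since no &mut Computer can be obtained any more.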
Unsafe Rust (Box<T> never moves T)
#![allow(dead_code)]
// CRUCIAL INFO:
//
// In order for this scheme to be safe, Computer *must not*
// expose any functionality that allows setting the ram or
// mmu to a different Box with a different memory location.
//
// Care must also be taken to prevent aliasing of &mut references
// to mmu and ram. This is not a completely safe interface,
// and its use must be restricted.
struct Computer {
    ram: Box<Ram>,
    cpu: Cpu,
    mmu: Box<Mmu>,
}

struct Ram;

// Cpu and Mmu are unsafe to use directly, and *must only*
// be exposed when properly set up inside a Computer
struct Cpu {
    mmu: *mut Mmu,
}

struct Mmu {
    ram: *mut Ram,
}

impl Cpu {
    // Safe if we uphold the invariant that Cpu must be
    // constructed in a Computer.
    fn mmu(&self) -> &Mmu {
        // Dereference the raw pointer directly; no transmute is needed.
        unsafe { &*self.mmu }
    }
}

impl Mmu {
    // Safe if we uphold the invariant that Mmu must be
    // constructed in a Computer.
    fn ram(&self) -> &Ram {
        unsafe { &*self.ram }
    }
}

impl Computer {
    fn new() -> Computer {
        let ram = Box::new(Ram);
        let mmu = Box::new(Mmu {
            // Creating a raw pointer is a plain cast and needs no unsafe block.
            ram: &*ram as *const Ram as *mut Ram,
        });
        let cpu = Cpu {
            mmu: &*mmu as *const Mmu as *mut Mmu,
        };

        // Safe to move the components in here because all the
        // references are references to data behind a Box, so the
        // data will not move.
        Computer {
            ram: ram,
            mmu: mmu,
            cpu: cpu,
        }
    }
}

fn main() {}
NOTE: This solution is not completely safe given an unrestricted interface to Computer: care must be taken to not allow aliasing or removal of the Mmu or Ram in the public interface of Computer.
This solution instead uses the invariant that data stored inside of a Box will never move (its address will never change) as long as the Box remains alive. Rust doesn't allow you to depend on this in safe code, since moving a Box can cause the memory behind it to be deallocated, thereby leaving a dangling pointer, but we can rely on it in unsafe code.
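The address-stability invariant can be observed without dereferencing anything. The sketch below only compares addresses, so it stays entirely in safe Rust. The Holder struct, the bytes field and the address_survives_move function are hypothetical names for the example; Ram is given a field here because a zero-sized type would not need a real heap allocation.

```rust
// Ram gets a hypothetical payload so that Box<Ram> really allocates.
struct Ram {
    bytes: [u8; 16],
}

// Hypothetical owner, standing in for Computer.
struct Holder {
    ram: Box<Ram>,
}

// Returns true if the heap address of the boxed Ram is the same before and
// after the Box itself has been moved into another struct.
fn address_survives_move() -> bool {
    let ram = Box::new(Ram { bytes: [0; 16] });
    let before: *const Ram = &*ram;

    // Moving the Box moves only the pointer, not the heap allocation.
    let holder = Holder { ram };
    let after: *const Ram = &*holder.ram;

    // Only the addresses are compared; the raw pointers are never dereferenced.
    std::ptr::eq(before, after)
}

fn main() {
    let holder = Holder { ram: Box::new(Ram { bytes: [0; 16] }) };
    println!("payload length: {}", holder.ram.bytes.len());
    println!("address stable across move: {}", address_survives_move());
}
```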
The main trick in this solution is to use raw pointers into the contents of the Box<Mmu> and Box<Ram> to store references into them in the Cpu and Mmu respectively. This gets you a mostly safe interface, and doesn't prevent you from moving the Computer around or even mutating it in restricted cases.
All of this said, I don't think either of these should really be the way you approach this problem. Ownership is a central concept in Rust, and it permeates the design choices of almost all code. If the Mmu owns the Ram and the Cpu owns the Mmu, that's the relationship you should have in your code. If you use Rc, you can even maintain the ability to share the underlying pieces, albeit immutably.
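One possible shape for the Rc variant is sketched below, assuming the components only need shared, read-only access to each other. The size field and the ram_size and ram_owner_count helpers are hypothetical and exist only to make the example observable.

```rust
use std::rc::Rc;

// `size` is a hypothetical field, added so there is something to read.
struct Ram {
    size: usize,
}

// The Mmu shares ownership of the Ram instead of borrowing it.
struct Mmu {
    ram: Rc<Ram>,
}

// The Cpu shares ownership of the Mmu.
struct Cpu {
    mmu: Rc<Mmu>,
}

struct Computer {
    ram: Rc<Ram>,
    mmu: Rc<Mmu>,
    cpu: Cpu,
}

impl Computer {
    fn new(size: usize) -> Computer {
        let ram = Rc::new(Ram { size });
        let mmu = Rc::new(Mmu { ram: Rc::clone(&ram) });
        let cpu = Cpu { mmu: Rc::clone(&mmu) };
        // No lifetime parameters are needed: every component keeps the
        // pieces it depends on alive by itself.
        Computer { ram, mmu, cpu }
    }
}

impl Cpu {
    // Hypothetical helper: reads the Ram through the shared Mmu.
    fn ram_size(&self) -> usize {
        self.mmu.ram.size
    }
}

// Hypothetical helper: how many owners the Ram currently has.
fn ram_owner_count(computer: &Computer) -> usize {
    Rc::strong_count(&computer.ram)
}

fn main() {
    let computer = Computer::new(1024);
    println!("cpu sees {} bytes of ram", computer.cpu.ram_size());
    println!("ram owners: {}", ram_owner_count(&computer));
    println!("mmu owners: {}", Rc::strong_count(&computer.mmu));
}
```

With this layout the Computer can be moved freely and needs no freeze step. If the shared pieces must be mutated, the usual next step would be wrapping them in RefCell, which trades compile-time borrow checks for runtime ones.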