Consider the following code (Playground version):
use std::cell::Cell;
struct Foo(u32);
#[derive(Clone, Copy)]
struct FooRef<'a>(&'a Foo);
// the bodies of these functions don't matter
fn testa<'a>(x: &FooRef<'a>, y: &'a Foo) { x; }
fn testa_mut<'a>(x: &mut FooRef<'a>, y: &'a Foo) { *x = FooRef(y); }
fn testb<'a>(x: &Cell<FooRef<'a>>, y: &'a Foo) { x.set(FooRef(y)); }
fn main() {
    let u1 = Foo(3);
    let u2 = Foo(5);
    let mut a = FooRef(&u1);
    let b = Cell::new(FooRef(&u1));
    // try one of the following 3 statements
    testa(&a, &u2); // allow move at (1)
    testa_mut(&mut a, &u2); // deny move -- fine!
    testb(&b, &u2); // deny move -- but how does rustc know?
    u2; // (1) move out
    // ... do something with a or b
}
I'm curious how rustc knows that Cell has interior mutability and may hold on to a reference of the other argument. If I create another data structure from scratch, similar to Cell, which also has interior mutability, how do I tell rustc that?
The reason the code with Cell compiles (ignoring the move of u2) and mutates is that Cell's whole API takes & pointers:
impl<T> Cell<T> where T: Copy {
    fn new(value: T) -> Cell<T> { ... }
    fn get(&self) -> T { ... }
    fn set(&self, value: T) { ... }
}
It is carefully written to allow mutation while shared, i.e. interior mutability. This allows it to expose these mutating methods behind a & pointer. Conventional mutation requires a &mut pointer (with its associated non-aliasing restrictions) because having unique access to a value is the only way to ensure that mutating it will be safe, in general.
So, the way to create types that allow mutation while shared is to ensure that their API for mutation uses & pointers instead of &mut. Generally speaking this should be done by having the type contain pre-written types like Cell, i.e. use them as building blocks.
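As a minimal sketch of the "building block" approach, here is a hypothetical HitCounter type (the name and its methods are made up for illustration) that wraps a Cell so its mutating method can take &self:

```rust
use std::cell::Cell;

// Hypothetical example type: a counter that can be bumped through a shared reference.
struct HitCounter {
    hits: Cell<u32>,
}

impl HitCounter {
    fn new() -> Self {
        HitCounter { hits: Cell::new(0) }
    }

    // Takes `&self`, not `&mut self`: the mutation goes through the inner Cell.
    fn record(&self) {
        self.hits.set(self.hits.get() + 1);
    }

    fn total(&self) -> u32 {
        self.hits.get()
    }
}

fn main() {
    let counter = HitCounter::new();
    let first = &counter;
    let second = &counter; // two shared references alive at the same time
    first.record();
    second.record();
    println!("{}", counter.total());
}
```

No unsafe code is needed here, because all the delicate parts stay inside Cell.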
The reason later use of u2 fails is a longer story...
UnsafeCell
At a lower level, mutating a value while it is shared (e.g. has multiple & pointers to it) is undefined behaviour, except for when the value is contained in an UnsafeCell. This is the very lowest level of interior mutability, designed to be used as a building block for building other abstractions.
Types that allow safe interior mutability, like Cell, RefCell (for sequential code), the Atomic*s, Mutex and RwLock (for concurrent code) all use UnsafeCell internally and impose some restrictions around it to ensure that it is safe. For example, the definition of Cell is:
pub struct Cell<T> {
    value: UnsafeCell<T>,
}
Cell ensures that mutations are safe by carefully restricting the API it offers: the T: Copy in the code above is key.
(If you wish to write your own low-level type with interior mutability, you just need to ensure that the things that are mutated while being shared are contained in an UnsafeCell. However, I recommend not doing this: Rust has several existing tools (the ones I mentioned above) for interior mutability that are carefully vetted to be safe and correct within Rust's aliasing and mutation rules; breaking the rules is undefined behaviour and can easily result in miscompiled programs.)
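For illustration only, a rough sketch of what such a hand-rolled type could look like. MyCell is a hypothetical name, and this is a stripped-down imitation of Cell rather than the standard library's actual implementation; in real code, reach for std::cell::Cell instead.

```rust
use std::cell::UnsafeCell;

// Hypothetical type for illustration; prefer std::cell::Cell in real code.
struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // SAFETY: MyCell is not Sync (UnsafeCell is not), so only one thread
        // can touch it, and no reference into the interior is ever handed
        // out; the value is only copied out.
        unsafe { *self.value.get() }
    }

    fn set(&self, value: T) {
        // SAFETY: same reasoning as in `get`; no reference into the interior
        // can be alive while this write happens.
        unsafe { *self.value.get() = value; }
    }
}

fn main() {
    let cell = MyCell::new(1);
    let shared = &cell;
    shared.set(5); // mutation through a shared reference
    println!("{}", cell.get());
}
```

The safety argument rests on the same two restrictions Cell relies on: values only move in and out by copy, and the type cannot be shared across threads.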
Anyway, the key that makes the compiler understand that the &u2 is borrowed for the cell case is variance of lifetimes. Typically, the compiler will shorten lifetimes when you pass things to functions, which makes things work great, e.g. you can pass a string literal (&'static str) to a function expecting &'a str, because the long 'static lifetime is shortened to 'a. This is happening for testa: the testa(&a, &u2) call is shortening the lifetimes of the references from the longest they could possibly be (the whole of the body of main) to just that function call. The compiler is free to do this because normal references are variant¹ in their lifetimes, i.e. it can vary them.
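A small sketch of that shortening in action. The helper longer is hypothetical; it simply forces both arguments to share a single lifetime 'a, so the &'static str has to be shortened to match the local borrow:

```rust
// Hypothetical helper: both arguments must share one lifetime 'a.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

fn main() {
    let literal: &'static str = "a string literal";
    let result_len;
    {
        let local = String::from("local");
        // `literal` is &'static str, but here its lifetime is shortened to
        // that of the borrow of `local`, so both arguments agree on 'a.
        let result = longer(literal, &local);
        result_len = result.len();
    }
    println!("{}", result_len);
}
```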
However, for testa_mut, the &mut FooRef<'a> stops the compiler being able to shorten that lifetime (in technical terms &mut T is "invariant in T"), exactly because something like testa_mut can happen. In this case, the compiler sees the &mut FooRef<'a> and understands that the 'a lifetime can't be shortened at all, and so in the call testa_mut(&mut a, &u2) it has to take the true lifetime of the u2 value (the whole function) and hence causes u2 to be borrowed for that region.
So, coming back to interior mutability: UnsafeCell<T> not only tells the compiler that a thing may be mutated while aliased (and hence inhibits some optimisations that would otherwise be invalid), it is also invariant in T, i.e. it acts like a &mut T for the purposes of this lifetime/borrowing analysis, exactly because it allows code like testb.
The compiler infers this variance automatically; it becomes invariant when some type parameter/lifetime is contained in UnsafeCell or &mut somewhere in the type (like FooRef in Cell<FooRef<'a>>).
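To see the inference at work on a user-defined type, here is a sketch with a hypothetical Slot wrapper (the names Slot, store and read are made up for this example). It carries no annotation at all; containing a Cell is enough for the compiler to treat it as invariant in 'a:

```rust
use std::cell::Cell;

struct Foo(u32);

#[derive(Clone, Copy)]
struct FooRef<'a>(&'a Foo);

// Hypothetical wrapper: the Cell inside makes Slot invariant in 'a.
struct Slot<'a> {
    current: Cell<FooRef<'a>>,
}

// Same shape as testb: stores `y` behind a shared reference.
fn store<'a>(slot: &Slot<'a>, y: &'a Foo) {
    slot.current.set(FooRef(y));
}

fn read(slot: &Slot<'_>) -> u32 {
    let FooRef(foo) = slot.current.get();
    foo.0
}

fn main() {
    let u1 = Foo(3);
    let u2 = Foo(5);
    let slot = Slot { current: Cell::new(FooRef(&u1)) };
    store(&slot, &u2);
    // let moved = u2; // should be rejected: u2 is still borrowed via `slot`
    println!("{}", read(&slot));
}
```

Uncommenting the move should produce the same kind of borrow error as the testb case in the question, since slot is still used afterwards.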
The Rustonomicon talks about this and other detailed considerations like it.
¹ Strictly speaking, there are four kinds of variance in type-system jargon: bivariance, covariance, contravariance and invariance. In practice Rust code mostly runs into covariance and invariance; contravariance shows up only for function argument types. When I say "variant" I really mean "covariant". See the Rustonomicon link above for more detail.
The relevant part from the Rust source code is this:
#[lang = "unsafe_cell"]
pub struct UnsafeCell<T: ?Sized> {
    value: T,
}
Specifically, the #[lang = "unsafe_cell"] is what tells the compiler that this particular type maps to its internal notion of "the interior mutability type". This sort of thing is called a "lang item".
You cannot define your own type for this purpose, as you can't have multiple instances of a single lang item. The only way to do so would be to completely replace the standard library with your own code.