I'm working with data wrapped in an Arc, and I sometimes end up using into_raw to get the raw pointer to the underlying data. My use case also calls for type erasure, so the raw pointer often gets cast to a *const c_void, then cast back to the appropriate concrete type when reconstructing the Arc.
I've run into a situation where it would be useful to clone the Arc without needing to know the concrete type of the underlying data. As I understand it, it should be safe to reconstruct the Arc with a dummy type solely for the purpose of calling clone, so long as I never actually dereference the data. So, for example, this should be safe:
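use std::{ffi::c_void, mem, sync::Arc};

pub unsafe fn clone_raw(handle: *const c_void) -> *const c_void {
    // Rebuild the Arc with a dummy type (inferred as Arc<c_void>) just to bump the count.
    let original = Arc::from_raw(handle);
    let copy = original.clone();
    // Forget the rebuilt Arc so the caller's handle keeps its reference.
    mem::forget(original);
    Arc::into_raw(copy)
}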
Is there anything I'm missing that would make this actually unsafe? Also, I assume the answer would apply to Rc as well, but if there are any differences please let me know!
This is almost always unsafe.
An Arc<T> is just a pointer to a heap-allocated struct which roughly looks like
struct ArcInner<T: ?Sized> {
    strong: atomic::AtomicUsize,
    weak: atomic::AtomicUsize,
    data: T, // You get a raw pointer to this element
}
into_raw() gives you a pointer to the data element. The implementation of Arc::from_raw() takes such a pointer, assumes that it points to the data element of an ArcInner<T>, and walks back in memory expecting to find the start of that ArcInner<T> there. This assumption depends on the memory layout of T, specifically its alignment and therefore its exact placement in ArcInner.
If you call into_raw() on an Arc<U> and then call from_raw() as if it were an Arc<V>, where U and V differ in alignment, the calculated offset of the data field within ArcInner can be wrong, and the call to .clone() will then corrupt memory by incrementing whatever it believes is the strong counter. Dereferencing T is therefore not required to trigger memory unsafety.
In practice, this might not be a problem: since data is the third field after two usize-sized counters, most T will probably end up at the same offset. However, if the stdlib implementation changes or you end up compiling for a platform where this assumption is wrong, calling Arc::<V>::from_raw on a pointer that was created by an Arc<U>, where the memory layouts of V and U differ, is undefined behavior and may crash.
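The "same offset" intuition can be checked with a little arithmetic. This is a minimal sketch, assuming the two counters and the data are laid out in declaration order as in the struct above; data_offset and data_offset_of are hypothetical helpers for illustration, not part of std:

```rust
use std::mem::{align_of, size_of};
use std::sync::atomic::AtomicUsize;

/// Offset of a field with alignment `align` that follows two `AtomicUsize`
/// counters, assuming the fields are laid out in declaration order.
/// `align` must be a power of two, as all Rust alignments are.
fn data_offset(align: usize) -> usize {
    let header = 2 * size_of::<AtomicUsize>();
    // Round `header` up to the next multiple of `align`.
    (header + align - 1) & !(align - 1)
}

/// Hypothetical convenience wrapper: the same calculation for a concrete type.
fn data_offset_of<T>() -> usize {
    data_offset(align_of::<T>())
}
```

On a 64-bit target, data_offset_of::<u8>() and data_offset_of::<u64>() both come out as 16, which is why mixing up such types tends to go unnoticed, while data_offset(32) is 32.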
Update:
Having thought about it some more, I downgrade my vote from "might be safe, but cringy" to "most likely unsafe", because I can always do
#[repr(align(32))]
struct Foo;
let foo = Arc::new(Foo);
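In this example Foo will be aligned to 32 bytes, making ArcInner<Foo> 32 bytes in size on a 64-bit target (8+8 for the counters, 16 bytes of padding, 0 for the data), while an ArcInner<()> is just 16 bytes (8+8+0+0). The data field therefore sits at offset 32 in the former and at offset 16 in the latter. Since there is no way to tell what the alignment of T is after the type has been erased, there is no way to reconstruct a valid Arc.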
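The numbers for Foo and () can be measured rather than taken on trust. The sketch below uses Inner<T>, a hypothetical #[repr(C)] stand-in for the private ArcInner<T> (the real type is not accessible from user code), and a hypothetical helper layout_of that reports its size and the offset of its data field:

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr::addr_of;
use std::sync::atomic::AtomicUsize;

// Hypothetical stand-in for the standard library's private `ArcInner<T>`.
#[allow(dead_code)]
#[repr(C)]
struct Inner<T> {
    strong: AtomicUsize,
    weak: AtomicUsize,
    data: T,
}

#[allow(dead_code)]
#[repr(align(32))]
struct Foo;

/// Returns `(size of Inner<T>, offset of its data field)`.
fn layout_of<T>() -> (usize, usize) {
    let slot = MaybeUninit::<Inner<T>>::uninit();
    let base = slot.as_ptr();
    // Only the field's address is computed; the uninitialized memory is never read.
    let data = unsafe { addr_of!((*base).data) };
    (size_of::<Inner<T>>(), data as usize - base as usize)
}
```

On a 64-bit target, layout_of::<Foo>() gives (32, 32) and layout_of::<()>() gives (16, 16), so a from_raw that assumes () would step back 16 bytes from a pointer that actually sits 32 bytes into the allocation.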
There is an escape hatch that might be safe in practice: by wrapping T in a Box, the layout of ArcInner<Box<T>> is always the same for any sized T. In order to force this upon any user, you can do something like
struct ArcBox<T>(Arc<Box<T>>);
and implement Deref on that. Using ArcBox instead of Arc forces the memory layout of ArcInner to always be the same, because T is behind another pointer. This, however, means that all access to T requires a double dereference, which might badly affect performance.
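A minimal sketch of how such a wrapper could look. The methods new, into_raw and from_raw on ArcBox and the free function clone_raw are hypothetical names chosen for illustration. The erased clone relies on the assumption that Box<()> and Box<T> have the same size and alignment for sized T, which is what the Arc::from_raw documentation asks for when the type differs; treat it as a sketch under that assumption rather than a vetted implementation:

```rust
use std::ffi::c_void;
use std::ops::Deref;
use std::sync::Arc;

pub struct ArcBox<T>(Arc<Box<T>>);

impl<T> ArcBox<T> {
    pub fn new(value: T) -> Self {
        ArcBox(Arc::new(Box::new(value)))
    }

    /// Gives up ownership and returns a type-erased handle.
    pub fn into_raw(self) -> *const c_void {
        Arc::into_raw(self.0) as *const c_void
    }

    /// # Safety
    /// `handle` must come from `ArcBox::<T>::into_raw` with the same `T`.
    pub unsafe fn from_raw(handle: *const c_void) -> Self {
        ArcBox(unsafe { Arc::from_raw(handle as *const Box<T>) })
    }
}

impl<T> Deref for ArcBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // Double dereference: Arc -> Box -> T.
        &**self.0
    }
}

/// Adds one strong reference without knowing `T`.
///
/// # Safety
/// `handle` must come from `ArcBox::<T>::into_raw` for some sized `T`
/// and must still hold a strong reference.
pub unsafe fn clone_raw(handle: *const c_void) -> *const c_void {
    // Assumption: Box<()> has the same size and alignment as Box<T>,
    // so the counter is found at the same place. The data is never touched.
    unsafe { Arc::increment_strong_count(handle as *const Box<()>) };
    handle
}
```

Each handle returned by into_raw or clone_raw must eventually be passed to ArcBox::<T>::from_raw with the original T exactly once, so that the strong count drops back and the value is freed with the right type.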