Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to create a cyclic reference with Arc and Weak?

I have two structs:

struct A { 
    map: HashMap<u32, Vec<B>>,
}

struct B {
    weak: Weak<A>
}

When A is constructed, it will own several B, each of which links back to the A which was just constructed, similar to this:

let a = Arc::new(A { map: HashMap::new() });

let b1 = B { weak: Arc::downgrade(&a) };
let b3 = B { weak: Arc::downgrade(&a) };
let b2 = B { weak: Arc::downgrade(&a) };

a.map.insert(5, vec![b1, b2]);
a.map.insert(10, vec![b3]);

Playground

This doesn't work since Arc does not provide a way to modify the map. Arc::get_mut does not work since a Weak is already constructed to the value.

How it is possible to construct an A with some B? I'm trying to avoid runtime checks when accessing map because after construction it will never be modified again. I have no problem using unsafe code or approved nightly features.

like image 584
Tim Diekmann Avatar asked May 19 '18 13:05

Tim Diekmann


2 Answers

I would approach this from the opposite direction, actually.

HashMap is a much more complicated type than Weak<T: Sized>, and therefore it is much easier to swap Weak after the fact. Therefore, my approach would be:

  1. Create the B with a dummy reference.
  2. Create the A, transfer ownership of the B.
  3. Iterate over the B, swapping out the A for the real one.

AFAIK, the standard library does not provide any way to (1) create null Weak and (2) atomically swap them out. Crossbeam has an example of ArcCell: simply search/replacing all Arc by Weak gives us a WeakCell!

With that WeakCell<T> at our disposal:

#[derive(Default)]
struct A { 
    map: HashMap<u32, Vec<B>>,
}

struct B {
    weak: WeakCell<A>,
}

impl A {
    pub fn new(map: HashMap<u32, Vec<B>>) -> Arc<A> {
        let a = Arc::new(A { map });
        let weak = Arc::downgrade(&a);
        for (_, bs) in &a.map {
            for b in bs {
                b.weak.set(weak.clone());
            }
        }
        a
    }
}

impl B {
    pub fn new(a: &Arc<A>) -> B { B { weak: WeakCell::new(Arc::downgrade(a)), } }
}

fn main() {
    let dummy = Arc::new(A::default());

    let (b1, b2, b3) = (B::new(&dummy), B::new(&dummy), B::new(&dummy));

    let mut map = HashMap::new();
    map.insert(5, vec![b1, b2]);
    map.insert(10, vec![b3]);

    let _a = A::new(map);

    //  Do something!
}

Which you can see in action on the playground.

It should be possible to modify WeakCell to construct it from 0 (guaranteeing it would be initialized later), thus avoiding the need for the dummy reference. That's an exercise left to the reader ;)

like image 20
Matthieu M. Avatar answered Oct 16 '22 13:10

Matthieu M.


Arc::get_mut() will fail if you have even existing Weak references so you need to consider using interior mutability instead. Since you are using Arc, I'll assume you are in a multi-threaded environment, so I'll use a RwLock which is thread-safe.

use std::sync::{Arc, Weak, RwLock};
use std::collections::HashMap;

struct A { 
    map: RwLock<HashMap<u32, Vec<B>>>,
}

struct B {
    weak: Weak<A>
}

Now you can construct these objects something like this:

fn init_a(a: Arc<A>) -> Arc<A> {
    let b1 = B { weak: Arc::downgrade(&a) };
    let b2 = B { weak: Arc::downgrade(&a) };
    // extra block is required so that the Mutex's write lock is dropped 
    // before we return a
    {
        let mut map = a.map.write().unwrap();
        let vec = map.entry(0).or_insert(Vec::new());
        vec.push(b1);
        vec.push(b2);
    }
    a
}

fn main() {
    let mut a = Arc::new(A { map: RwLock::new(HashMap::new()) });
    a = init_a(a);
}

If you really want to get rid of all runtime overhead of the Mutex, and you don't mind using unsafe code, you could use an UnsafeCell. It has zero overhead, but its interface requires an unsafe block and it's an extra layer of unwrapping in your code. Also UnsafeCell is not Sync, so you couldn't share it between threads.

To solve those problems, by making sure that you only need to consider UnsafeCell during construction, you can can take advantage of the fact that UnsafeCell has zero size cost and does not affect layout. Instead of A, Use a different type for constructing, which is identical to A apart from the UnsafeCell. These types can then be used interchangeably with mem::transmute.

use std::collections::HashMap;
use std::sync::{Arc, Weak};
use std::cell::UnsafeCell;
use std::mem;

struct A { 
    map: HashMap<u32, Vec<B>>,
}

struct B {
    weak: Weak<A>
}

impl A {
    fn new() -> Arc<A> {
        let a = A { map: HashMap:: new() };
        Self::init_a(Arc::new(a))
    }

    fn init_a(a: Arc<A>) -> Arc<A> {
        // Important: The layout is identical to A
        struct AConstruct {
            map: UnsafeCell<HashMap<u32, Vec<B>>>,
        }
        // Treat the object as if was an AConstruct instead
        let a: Arc<AConstruct> = unsafe { mem::transmute(a) };
        let map = unsafe { &mut *a.map.get() };
        // B's weak references are to Arc<A> not to Arc<AConstruct> 
        let weak_a: Weak<A> = unsafe { mem::transmute(Arc::downgrade(&a)) };

        // Actual initialization here
        let vec = map.entry(0).or_insert(Vec::new());
        let b1 = B { weak: weak_a.clone() };
        let b2 = B { weak: weak_a.clone() };
        vec.push(b1);
        vec.push(b2);

        // We're done. Pretend the UnsafeCells never existed
        unsafe { mem::transmute(a) }
    }
}

You can also do this with raw pointers, but I feel a little "safer" with UnsafeCell! LLVM does some optimisations when it's given guarantees that certain data is immutable, and UnsafeCell does some magic to protect you when it breaks those guarantees. So I'm not 100% certain of the safety of doing this:

fn init_a(a: Arc<A>) -> Arc<A> {
    // Raw IMMUTABLE pointer to the HashMap 
    let ptr = &a.map as *const HashMap<_, _>;
    // Unsafely coerce it to MUTABLE
    let map: &mut HashMap<_, _> = unsafe { mem::transmute(ptr) };
    let weak_a: Weak<A> = Arc::downgrade(&a);

    // Actual initialization here
    let vec = map.entry(0).or_insert(Vec::new());
    let b1 = B { weak: weak_a.clone() };
    let b2 = B { weak: weak_a.clone() };
    vec.push(b1);
    vec.push(b2);

    a
}
like image 138
Peter Hall Avatar answered Oct 16 '22 15:10

Peter Hall