Dealing with problematic parent-child relationships enforced by C FFI

Tags: rust, lifetime, ffi

I have a C library with an interface similar to the one below. (I have emulated the C API in Rust so that all of the code in this question can be concatenated into a single .rs file and easily tested.)

// Opaque handles to C structs
struct c_A {}
struct c_B {}

// These 2 `create` functions allocate some heap memory and other 
// resources, so I have represented this using Boxes.
extern "C" fn create_a() -> *mut c_A {
    let a = Box::new(c_A {});
    Box::into_raw(a)
}

// This C FFI function frees some memory and other resources, 
// so I have emulated that here.
extern "C" fn destroy_a(a: *mut c_A) {
    let _a: Box<c_A> = unsafe { Box::from_raw(a) };
}

extern "C" fn create_b(_a: *mut c_A) -> *mut c_B {
    let b = Box::new(c_B {});
    Box::into_raw(b)
}

// Note: While unused here, the argument `_a` is actually used in 
// the C library, so I cannot remove it. (Also, I don't control 
// the C interface)
extern "C" fn destroy_b(_a: *mut c_A, b: *mut c_B) {
    let _b = unsafe { Box::from_raw(b) };
}

I have created the following Rusty abstraction over the C functions:

struct A {
    a_ptr: *mut c_A,
}

impl A {
    fn new() -> A {
        A { a_ptr: create_a() }
    }
}

impl Drop for A {
    fn drop(&mut self) {
        destroy_a(self.a_ptr);
    }
}

struct B<'a> {
    b_ptr: *mut c_B,
    a: &'a A,
}

impl<'a> B<'a> {
    fn new(a: &'a A) -> B<'a> {
        B {
            b_ptr: create_b(a.a_ptr),
            a,
        }
    }
}

impl<'a> Drop for B<'a> {
    fn drop(&mut self) {
        destroy_b(self.a.a_ptr, self.b_ptr);
    }
}

The B struct contains a reference to A for the sole reason that the a_ptr is necessary when calling the destroy_b function for memory cleanup. This reference is not needed by me for any of my Rust code.

I would now like to create the following struct, which holds both an A and a B created from it:

struct C<'b> {
    a: A,
    b: B<'b>,
}

impl<'b> C<'b> {
    fn new() -> C<'b> {
        let a = A::new();
        let b = B::new(&a);
        C { a, b }
    }
}

// Main function just so it compiles
fn main() {
    let c = C::new();
}

However, this will not compile:

error[E0597]: `a` does not live long enough
  --> src/main.rs:76:25
   |
76 |         let b = B::new(&a);
   |                         ^ borrowed value does not live long enough
77 |         C { a, b }
78 |     }
   |     - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'b as defined on the impl at 73:1...
  --> src/main.rs:73:1
   |
73 | impl<'b> C<'b> {
   | ^^^^^^^^^^^^^^

I understand why this fails: When returning the C struct from C::new(), it moves the C. This means that the A contained within is moved, which renders all references to it invalid. Therefore, there's no way I could create that C struct. (Explained in much more detail here)

How can I refactor this code so that I can store a B in a struct along with its "parent" A? I have considered a few options, but each has a problem:

  • Change the C interface: I don't control the C interface, so I can't change it.
  • Have B store a *mut c_A instead of &A: If A is dropped, then that raw pointer becomes invalid, and will result in undefined behavior when B is dropped.
  • Have B store an owned A rather than a reference &A: For my use case, I need to be able to create multiple Bs for each A. If B owns A, then each A can only be used to create one B.
  • Have A own all instances of B, and only return references to B when creating a new B: This has the problem that Bs will accumulate over time until the A is dropped, using up more memory than necessary. However, if this is indeed the best way to go about it, I can deal with the slight inconvenience.
  • Use the rental crate: I would rather take the slight memory usage hit than add the complexity of a new macro to my code. (That is, the complexity of anyone reading my code needing to learn how this macro works)

I suspect that the best solution somehow involves storing at least A on the heap so it doesn't need to move around, but I can't figure out how to make this work. Also, I wonder if there is something clever that I can do using raw pointers.

ItsTimmy asked Jul 05 '18 20:07

1 Answer

This sounds like an ideal case for reference counting. Use Rc or Arc, depending on your multithreading needs:

use std::rc::Rc;

struct B {
    b_ptr: *mut c_B,
    a: Rc<A>,
}

impl B {
    fn new(a: Rc<A>) -> B {
        B {
            b_ptr: create_b(a.a_ptr),
            a,
        }
    }
}

impl Drop for B {
    fn drop(&mut self) {
        destroy_b(self.a.a_ptr, self.b_ptr);
    }
}

fn main() {
    let a = Rc::new(A::new());
    let x = B::new(a.clone());
    let y = B::new(a);
}

Checking this against the concerns in the question:

  • Does not change the C interface.
  • A cannot be dropped while there are still Bs referencing it.
  • Can create multiple Bs for each A.
  • A's memory usage will not increase forever.
  • Creates a single heap allocation to store A and its reference count.
  • Rc is in the standard library, no new crate to learn.
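
To tie this back to the struct C from the question, here is a minimal sketch that stores an A and a B side by side and records the order of the emulated C calls. The LOG thread-local, the log helper, and the run_demo function are illustrative additions that are not part of the original code; the C API is emulated in Rust exactly as in the question.

```rust
use std::cell::RefCell;
use std::rc::Rc;

thread_local! {
    // Hypothetical helper: records the order of the emulated C calls.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

fn log(event: &'static str) {
    LOG.with(|l| l.borrow_mut().push(event));
}

// Emulated C API from the question, with logging added.
#[allow(non_camel_case_types)]
struct c_A {}
#[allow(non_camel_case_types)]
struct c_B {}

extern "C" fn create_a() -> *mut c_A {
    log("create_a");
    Box::into_raw(Box::new(c_A {}))
}

extern "C" fn destroy_a(a: *mut c_A) {
    log("destroy_a");
    drop(unsafe { Box::from_raw(a) });
}

extern "C" fn create_b(_a: *mut c_A) -> *mut c_B {
    log("create_b");
    Box::into_raw(Box::new(c_B {}))
}

extern "C" fn destroy_b(_a: *mut c_A, b: *mut c_B) {
    log("destroy_b");
    drop(unsafe { Box::from_raw(b) });
}

struct A {
    a_ptr: *mut c_A,
}

impl A {
    fn new() -> A {
        A { a_ptr: create_a() }
    }
}

impl Drop for A {
    fn drop(&mut self) {
        destroy_a(self.a_ptr);
    }
}

struct B {
    b_ptr: *mut c_B,
    a: Rc<A>,
}

impl B {
    fn new(a: Rc<A>) -> B {
        B { b_ptr: create_b(a.a_ptr), a }
    }
}

impl Drop for B {
    fn drop(&mut self) {
        destroy_b(self.a.a_ptr, self.b_ptr);
    }
}

// The struct from the question; it no longer needs a lifetime parameter.
struct C {
    a: Rc<A>,
    b: B,
}

impl C {
    fn new() -> C {
        let a = Rc::new(A::new());
        let b = B::new(Rc::clone(&a));
        C { a, b } // moving `a` only moves the Rc handle, not the A itself
    }
}

fn run_demo() -> Vec<&'static str> {
    LOG.with(|l| l.borrow_mut().clear());
    let c = C::new();
    // Two owners: `c.a` and the clone stored inside `c.b`.
    assert_eq!(Rc::strong_count(&c.a), 2);
    let C { a, b } = c;
    drop(a); // A stays alive because `b` still holds an Rc to it
    drop(b); // destroy_b runs first, then the last Rc goes away and destroy_a runs
    LOG.with(|l| l.borrow().clone())
}

fn main() {
    for event in run_demo() {
        println!("{}", event);
    }
}
```

Even though the A handle is dropped before the B in this sketch, destroy_a runs only after destroy_b, which is the ordering the C library requires.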

On current stable Rust, a method can take self: &Rc<Self> as its receiver without any feature gate, which lets you write this in a nicer manner:

use std::rc::Rc;

struct A {
    a_ptr: *mut c_A,
}

impl A {
    fn new() -> A {
        A { a_ptr: create_a() }
    }

    fn make_b(self: &Rc<Self>) -> B {
        B {
            b_ptr: create_b(self.a_ptr),
            a: self.clone(),
        }
    }
}

impl Drop for A {
    fn drop(&mut self) {
        destroy_a(self.a_ptr);
    }
}

struct B {
    b_ptr: *mut c_B,
    a: Rc<A>,
}

impl Drop for B {
    fn drop(&mut self) {
        destroy_b(self.a.a_ptr, self.b_ptr);
    }
}

fn main() {
    let a = Rc::new(A::new());
    let x = a.make_b();
    let y = a.make_b();
}
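
If you need the Arc variant mentioned at the top, note that raw pointers are neither Send nor Sync, so the wrappers will not cross thread boundaries on their own. The following is a rough sketch under the assumption that the C library tolerates calls from any thread; that is something only the library's documentation can confirm, and the unsafe impl lines are wrong if it does not. The spawn_workers function is a hypothetical name used only for illustration.

```rust
use std::sync::Arc;
use std::thread;

// Emulated C API, as in the question.
#[allow(non_camel_case_types)]
struct c_A {}
#[allow(non_camel_case_types)]
struct c_B {}

extern "C" fn create_a() -> *mut c_A {
    Box::into_raw(Box::new(c_A {}))
}

extern "C" fn destroy_a(a: *mut c_A) {
    drop(unsafe { Box::from_raw(a) });
}

extern "C" fn create_b(_a: *mut c_A) -> *mut c_B {
    Box::into_raw(Box::new(c_B {}))
}

extern "C" fn destroy_b(_a: *mut c_A, b: *mut c_B) {
    drop(unsafe { Box::from_raw(b) });
}

struct A {
    a_ptr: *mut c_A,
}

// ASSUMPTION: the real C library allows its handles to be used from any
// thread. Do not copy these impls unless that is documented to be true.
unsafe impl Send for A {}
unsafe impl Sync for A {}

impl A {
    fn new() -> A {
        A { a_ptr: create_a() }
    }
}

impl Drop for A {
    fn drop(&mut self) {
        destroy_a(self.a_ptr);
    }
}

struct B {
    b_ptr: *mut c_B,
    a: Arc<A>,
}

// ASSUMPTION: same thread-safety caveat as for A.
unsafe impl Send for B {}

impl B {
    fn new(a: Arc<A>) -> B {
        B { b_ptr: create_b(a.a_ptr), a }
    }
}

impl Drop for B {
    fn drop(&mut self) {
        destroy_b(self.a.a_ptr, self.b_ptr);
    }
}

// Creates `n` Bs from one shared A, drops each B on its own thread, and
// returns the number of remaining owners of the A.
fn spawn_workers(n: usize) -> usize {
    let a = Arc::new(A::new());
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let b = B::new(Arc::clone(&a));
            // `b` is moved into the thread and destroyed there.
            thread::spawn(move || drop(b))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    Arc::strong_count(&a)
}

fn main() {
    println!("owners left: {}", spawn_workers(4));
}
```

Once every thread has been joined, only the original Arc remains, so the A is destroyed when it goes out of scope at the end of spawn_workers.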

Shepmaster answered Nov 16 '22 03:11