
How can I use PhantomData in a struct, with raw pointers, such that the struct does not outlive the lifetime of the referenced other struct?

Tags:

rust

I have a struct that contains unsafe code and raw mutable pointers to another type of struct. The unsafe struct should only be used during the lifetime of the other struct, but you cannot specify a lifetime for raw pointers. I discovered that std::marker::PhantomData can be used to attach an otherwise unused lifetime to a struct, but I am having trouble getting it to work. I'm not sure whether this is an invalid use case or I'm doing something wrong.

Simplified Example:

use std::marker::PhantomData;

pub struct Test {
    value: u32,
}

impl Test {
    pub fn value(&self) {
        println!("{}", self.value)
    }

    pub fn set_value(&mut self, value: u32) {
        self.value = value;
    }
}

// I want compiler to complain about the lifetime of test
// so that UnsafeStruct is not used after test is dropped
pub struct UnsafeStruct<'a> {
    test: *mut Test,
    phantom: PhantomData<&'a mut Test>,
}

impl<'a> UnsafeStruct<'a> {
    pub fn new(test: &'a mut Test) -> UnsafeStruct<'a> {
        UnsafeStruct {
            test: test,
            phantom: PhantomData,
        }
    }

    pub fn test_value(&self) {
        unsafe { println!("{}", (*self.test).value) }
    }

    pub fn set_test_value(&mut self, value: u32) {
        unsafe {
            (*self.test).set_value(value);
        }
    }
}

fn main() {
    // No borrow checker errors
    // but the compiler does not complain about lifetime of test
    let mut unsafe_struct: UnsafeStruct;
    {
        let mut test = Test { value: 0 };
        unsafe_struct = UnsafeStruct {
            test: &mut test,
            phantom: PhantomData,
        };

        unsafe_struct.set_test_value(1);
        test.value();

        test.set_value(2);
        unsafe_struct.test_value();
    }
    unsafe_struct.set_test_value(3);
    unsafe_struct.test_value();

    // Lifetime errors are caught here,
    // but fixing them leads to borrow checker errors (next block)
    let mut unsafe_struct: UnsafeStruct;
    {
        let mut test = Test { value: 0 };
        unsafe_struct = UnsafeStruct::new(&mut test);

        unsafe_struct.set_test_value(1);
        test.value();

        test.set_value(2);
        unsafe_struct.test_value();
    }
    unsafe_struct.set_test_value(3);
    unsafe_struct.test_value();

    // Borrow checker errors when you fix lifetime error
    {
        let mut test = Test { value: 0 };
        let mut unsafe_struct: UnsafeStruct;
        unsafe_struct = UnsafeStruct::new(&mut test);

        unsafe_struct.set_test_value(1);
        test.value();

        test.set_value(2);
        unsafe_struct.test_value();
    }
}

If I create the UnsafeStruct directly, the compiler does not catch the lifetime errors, and I would like to use a constructor function anyway. If I use the constructor function, I get borrow checker errors. Is it possible to fix this code so that the compiler reports an error when an UnsafeStruct is used outside the lifetime of the corresponding Test, but without the borrow checker errors shown in the example?

Jake asked Sep 01 '25 02:09



2 Answers

I am answering my own question. The problem I was trying to solve was using std::marker::PhantomData to add a lifetime to a struct with raw pointers in order to prevent use-after-free errors. You cannot achieve this with PhantomData. There is a use case for declaring otherwise unused lifetimes, but that is different from what I was trying to accomplish, and it was the source of my confusion and of this question.

I was already aware that you must guard against use-after-free and similar errors yourself when using unsafe code, and I had done so. I just thought I might be able to catch this kind of use-after-free error at compile time instead of at runtime.

Jake answered Sep 02 '25 16:09



TL;DR What you're doing violates the exclusivity requirement of mutable references, but you can use shared references and interior mutability to make an API that works.

A &mut T reference represents exclusive access to a T. When you borrow an object with &mut, that object must not be accessed, mutably or immutably, through any other reference for the lifetime of the &mut borrow. In this example:

let mut test = Test { value: 0 };
let mut unsafe_struct: UnsafeStruct;
unsafe_struct = UnsafeStruct::new(&mut test);

unsafe_struct.set_test_value(1);
test.value();

test.set_value(2);
unsafe_struct.test_value();

unsafe_struct keeps the &mut borrow of test alive. It doesn't matter that internally it contains a raw pointer; it could contain nothing. The 'a in UnsafeStruct<'a> extends the lifetime of the borrow, so until unsafe_struct has been used for the last time, the borrow checker rejects any direct access to test, and working around that with unsafe code would be undefined behavior.
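
As a rough sketch of that rule, the accesses can be reordered so that every use of unsafe_struct comes before the first direct use of test; the borrow then ends at the wrapper's last use and the code compiles. The getters here return the value instead of printing it, and sequential_access is a hypothetical helper name; neither is part of the question's code.

```rust
use std::marker::PhantomData;

pub struct Test {
    value: u32,
}

impl Test {
    // Assumption: returns the value (the original prints it) so it can be checked.
    pub fn value(&self) -> u32 {
        self.value
    }

    pub fn set_value(&mut self, value: u32) {
        self.value = value;
    }
}

pub struct UnsafeStruct<'a> {
    test: *mut Test,
    // Ties the struct to the exclusive borrow handed to `new`.
    phantom: PhantomData<&'a mut Test>,
}

impl<'a> UnsafeStruct<'a> {
    pub fn new(test: &'a mut Test) -> UnsafeStruct<'a> {
        UnsafeStruct {
            test: test as *mut Test,
            phantom: PhantomData,
        }
    }

    pub fn test_value(&self) -> u32 {
        // SAFETY: 'a keeps the Test alive and exclusively borrowed.
        unsafe { (*self.test).value }
    }

    pub fn set_test_value(&mut self, value: u32) {
        // SAFETY: same reasoning as in `test_value`.
        unsafe { (*self.test).set_value(value) }
    }
}

// Hypothetical helper: all wrapper accesses happen before any direct access.
pub fn sequential_access() -> (u32, u32) {
    let mut test = Test { value: 0 };
    let mut unsafe_struct = UnsafeStruct::new(&mut test);

    unsafe_struct.set_test_value(1);
    // Calling test.value() at this point would be rejected by the borrow
    // checker, because unsafe_struct is still used on the next line.
    let via_wrapper = unsafe_struct.test_value(); // last use of the wrapper

    // The exclusive borrow has ended, so direct access is allowed again.
    test.set_value(2);
    let direct = test.value();
    (via_wrapper, direct)
}

fn main() {
    let (via_wrapper, direct) = sequential_access();
    println!("{} {}", via_wrapper, direct);
}
```

This only helps when the two kinds of access do not need to interleave; the interleaved pattern from the question needs shared access instead.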

The example suggests that you actually want shared access to a resource (that is, shared between test and unsafe_struct). Rust has a shared reference type; it's &T. If you want the original T to still be accessible while a borrow is live, that borrow has to be shared (&), not exclusive (&mut).

How do you mutate something if all you have is a shared reference? By using interior mutability.

use std::cell::Cell;

pub struct Test {
    value: Cell<u32>,
}

impl Test {
    pub fn value(&self) {
        println!("{}", self.value.get())
    }

    pub fn set_value(&self, value: u32) {
        self.value.set(value);
    }
}

pub struct SafeStruct<'a> {
    test: &'a Test,
}

impl<'a> SafeStruct<'a> {
    pub fn new(test: &'a Test) -> SafeStruct<'a> {
        SafeStruct { test }
    }

    pub fn test_value(&self) {
        println!("{}", self.test.value.get())
    }

    pub fn set_test_value(&self, value: u32) {
        self.test.set_value(value);
    }
}
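
A possible usage of this API, reproducing the interleaved access pattern from the question, might look like the sketch below. It repeats the definitions so that it is self-contained, with one assumed change: the getters return the value instead of printing it. interleaved_access is a hypothetical helper name.

```rust
use std::cell::Cell;

pub struct Test {
    value: Cell<u32>,
}

impl Test {
    // Assumption: returns the value (the original prints it) so it can be checked.
    pub fn value(&self) -> u32 {
        self.value.get()
    }

    pub fn set_value(&self, value: u32) {
        self.value.set(value);
    }
}

pub struct SafeStruct<'a> {
    test: &'a Test,
}

impl<'a> SafeStruct<'a> {
    pub fn new(test: &'a Test) -> SafeStruct<'a> {
        SafeStruct { test }
    }

    pub fn test_value(&self) -> u32 {
        self.test.value()
    }

    pub fn set_test_value(&self, value: u32) {
        self.test.set_value(value);
    }
}

// Hypothetical helper: interleaves access through the wrapper and the original.
pub fn interleaved_access() -> (u32, u32) {
    let test = Test { value: Cell::new(0) };
    let safe_struct = SafeStruct::new(&test);

    safe_struct.set_test_value(1);
    let seen_directly = test.value(); // observes the write made via the wrapper

    test.set_value(2);
    let seen_via_wrapper = safe_struct.test_value(); // observes the direct write

    (seen_directly, seen_via_wrapper)
}

fn main() {
    let (seen_directly, seen_via_wrapper) = interleaved_access();
    println!("{} {}", seen_directly, seen_via_wrapper);
}
```

Because SafeStruct holds an ordinary &'a Test, the compiler still rejects any attempt to use it after test has gone out of scope, which is the compile-time check the question asked for.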

There's no unsafe code left -- Cell is a safe abstraction. You could also use AtomicU32 instead of Cell<u32> for thread safety, or, if the real content of Test is more complicated, RefCell, RwLock, or Mutex. These are all abstractions that provide shared ("interior") mutability, but they differ in usage. Read the documentation and the links below for more detail.
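
For content that is not Copy, a RefCell-based variant might look roughly like this. The Vec<u32> field and the names push, total, and refcell_demo are made up for illustration; the question does not say what the real Test contains.

```rust
use std::cell::RefCell;

pub struct Test {
    // Hypothetical "more complicated" content.
    values: RefCell<Vec<u32>>,
}

impl Test {
    pub fn new() -> Test {
        Test {
            values: RefCell::new(Vec::new()),
        }
    }

    // Mutates through a shared reference; the borrow is checked at runtime.
    pub fn push(&self, value: u32) {
        self.values.borrow_mut().push(value);
    }

    pub fn total(&self) -> u32 {
        self.values.borrow().iter().sum()
    }
}

// Hypothetical helper showing shared mutation and the runtime borrow check.
pub fn refcell_demo() -> (u32, bool) {
    let test = Test::new();
    let alias = &test;

    alias.push(1);
    test.push(2);
    let total = alias.total();

    // While a shared borrow guard is alive, a mutable borrow is refused.
    let guard = test.values.borrow();
    let conflict = test.values.try_borrow_mut().is_err();
    drop(guard);

    (total, conflict)
}

fn main() {
    let (total, conflict) = refcell_demo();
    println!("{} {}", total, conflict);
}
```

The trade-off compared with Cell is that RefCell moves the exclusivity check to runtime: borrow_mut panics if another borrow is active, and try_borrow_mut returns an error instead.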

As a last resort, if you need shared mutable access to an object with no overhead and are willing to take full responsibility for guaranteeing its correctness, you can use UnsafeCell. This does require unsafe code, but you can write any API you want. Note that all the safe abstractions I just mentioned are built on UnsafeCell internally. You cannot have shared mutability without it.
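
A minimal sketch of the UnsafeCell route, assuming single-threaded use and a Copy payload, could look like this. It is essentially a hand-rolled Cell<u32>, and unsafe_cell_demo is a hypothetical helper name.

```rust
use std::cell::UnsafeCell;

pub struct Test {
    value: UnsafeCell<u32>,
}

impl Test {
    pub fn new(value: u32) -> Test {
        Test {
            value: UnsafeCell::new(value),
        }
    }

    pub fn value(&self) -> u32 {
        // SAFETY: UnsafeCell makes Test !Sync, so only one thread can reach
        // the value, and no reference to the inner u32 is ever handed out;
        // the value is copied out immediately.
        unsafe { *self.value.get() }
    }

    pub fn set_value(&self, value: u32) {
        // SAFETY: same reasoning as above; the write cannot overlap with any
        // live reference to the inner value.
        unsafe {
            *self.value.get() = value;
        }
    }
}

// Hypothetical helper: two shared references mutate the same Test.
pub fn unsafe_cell_demo() -> u32 {
    let test = Test::new(0);
    let a = &test;
    let b = &test;

    a.set_value(5);
    b.set_value(b.value() + 1);
    test.value()
}

fn main() {
    println!("{}", unsafe_cell_demo());
}
```

The soundness argument lives entirely in the SAFETY comments: as soon as the API returns a reference into the cell or the type is made Sync, those invariants have to be re-established by hand.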

Links

  • How does the Rust compiler know `Cell` has internal mutability?
  • RefCell<T> and the Interior Mutability Pattern (from the official book)
trent answered Sep 02 '25 17:09
