
Writing to a field in a MaybeUninit structure?

I'm doing something with MaybeUninit and FFI in Rust that seems to work, but I suspect may be unsound/relying on undefined behavior.

My aim is to have a struct MoreA extend a struct A by including A as its first field, then to call some C code that writes to that A, and finally to finish MoreA by filling in its additional fields based on what's in A.

In my application, the additional fields of MoreA are all integers, so I don't have to worry about assignments to them dropping the (uninitialized) previous values.

Here's a minimal example:

use core::fmt::Debug;
use std::mem::MaybeUninit;

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct A(i32, i32);

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct MoreA {
    head: A,
    more: i32,
}

unsafe fn mock_ffi(p: *mut A) {
    // write doesn't drop previous (uninitialized) occupant of p
    p.write(A(1, 2));
}

fn main() {
    let mut b = MaybeUninit::<MoreA>::uninit();
    unsafe { mock_ffi(b.as_mut_ptr().cast()); }
    let b = unsafe {
        let mut b = b.assume_init();
        b.more = 3;
        b
    };
    assert_eq!(&b, &MoreA { head: A(1, 2), more: 3 });
}

Is the code let b = unsafe { ... } sound? It runs fine and Miri doesn't complain.

But the MaybeUninit docs say:

Moreover, uninitialized memory is special in that the compiler knows that it does not have a fixed value. This makes it undefined behavior to have uninitialized data in a variable even if that variable has an integer type, which otherwise can hold any fixed bit pattern.

Also, the Rust Reference says that "Behavior considered undefined" includes:

  • Producing an invalid value, even in private fields and locals. "Producing" a value happens any time a value is assigned to or read from a place, passed to a function/primitive operation or returned from a function/primitive operation. The following values are invalid (at their respective type):

    ... An integer (i*/u*) or ... obtained from uninitialized memory.

On the other hand, it doesn't seem possible to write to the more field before calling assume_init. Later on the same page:

There is currently no supported way to create a raw pointer or reference to a field of a struct inside MaybeUninit<Struct>. That means it is not possible to create a struct by calling MaybeUninit::uninit::<Struct>() and then writing to its fields.

If what I'm doing in the above code example does trigger undefined behavior, what would solutions be?

  1. I'd like to avoid boxing the A value (that is, I'd like to have it be directly included in MoreA).

  2. I'd hope also to avoid having to create one A to pass to mock_ffi and then having to copy the results into MoreA. A in my real application is a large structure.

I guess if there's no sound way to get what I'm after, though, I'd have to choose one of those two fallbacks.

If struct A is of a type that can hold the bit-pattern 0 as a valid value, then I guess a third fallback would be:

  3. Start with MaybeUninit::zeroed() rather than MaybeUninit::uninit().
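
For illustration, that third fallback might look like the minimal sketch below. It repeats A, MoreA and mock_ffi from the example above so that it stands alone; build_zeroed is a hypothetical helper name, and the sketch assumes that all-zero bytes are a valid value for every field, which holds here because they are all i32.

```rust
use std::mem::MaybeUninit;

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct A(i32, i32);

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct MoreA {
    head: A,
    more: i32,
}

// Stand-in for the real C function, as in the question.
unsafe fn mock_ffi(p: *mut A) {
    unsafe { p.write(A(1, 2)) }
}

// Hypothetical helper: builds a MoreA starting from zeroed memory.
fn build_zeroed() -> MoreA {
    // Every byte is zero, which is a valid bit pattern for all fields here.
    let mut b = MaybeUninit::<MoreA>::zeroed();
    unsafe {
        // `head` sits at offset 0 because MoreA is repr(C).
        mock_ffi(b.as_mut_ptr().cast::<A>());
        // Sound: the whole struct already held valid (zero) values.
        let mut b = b.assume_init();
        b.more = b.head.0 + b.head.1;
        b
    }
}

fn main() {
    let b = build_zeroed();
    assert_eq!(b, MoreA { head: A(1, 2), more: 3 });
    println!("{:?}", b);
}
```

The cost is one extra pass zeroing memory that the C code then overwrites, which may matter when A is large.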
dubiousjim asked Apr 20 '20 08:04



2 Answers

Currently, the only sound way to hold uninitialized memory of any type in a variable is MaybeUninit. Reading an uninitialized integer often appears to work in practice, but the documentation quoted in the question classes it as undefined behavior, so it should not be relied on. Producing an uninitialized bool, or a value of most other types, is definitely undefined behavior.

In general, as the documentation states, you cannot initialize a struct field by field. However, it is sound to do so as long as:

  1. the struct has repr(C). This is necessary because it prevents Rust from doing clever layout tricks, so that the layout of a field of type MaybeUninit<T> remains identical to the layout of a field of type T, regardless of its adjacent fields.
  2. every field is MaybeUninit. This lets us assume_init() for the entire struct, and then later initialise each field individually.

Given that your struct is already repr(C), you can use an intermediate representation that uses MaybeUninit for every field. The repr(C) also means that we can transmute between the two types once everything is initialized, provided that the two structs have the same fields in the same order.

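use std::mem::{self, MaybeUninit};

// `A`, `MoreA` and `mock_ffi` are as defined in the question.

// Same fields, in the same order, as `MoreA`, each wrapped in MaybeUninit.
#[repr(C)]
struct MoreAConstruct {
    head: MaybeUninit<A>,
    more: MaybeUninit<i32>,
}

fn main() {
    let b: MoreA = unsafe {
        // It's OK to assume a struct is initialized when all of its fields are MaybeUninit
        let mut b_construct = MaybeUninit::<MoreAConstruct>::uninit().assume_init();
        // The FFI call initializes `head` in place.
        mock_ffi(b_construct.head.as_mut_ptr());
        b_construct.more = MaybeUninit::new(3);
        // Every field is initialized now, so reinterpreting as `MoreA` is fine.
        mem::transmute(b_construct)
    };
    assert_eq!(b, MoreA { head: A(1, 2), more: 3 });
}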
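
The "same fields in the same order" condition can be checked at compile time. The following is only a sketch: layouts_match is a hypothetical helper name, and it assumes a toolchain where std::mem::offset_of! is available (Rust 1.77 or later). mem::transmute already rejects types of different sizes, so the part this adds is the alignment and field offset comparison.

```rust
use std::mem::{align_of, offset_of, size_of, MaybeUninit};

#[allow(dead_code)]
#[repr(C)]
struct A(i32, i32);

#[allow(dead_code)]
#[repr(C)]
struct MoreA {
    head: A,
    more: i32,
}

#[allow(dead_code)]
#[repr(C)]
struct MoreAConstruct {
    head: MaybeUninit<A>,
    more: MaybeUninit<i32>,
}

// Hypothetical helper: true when both structs have the same size,
// alignment and field offsets.
const fn layouts_match() -> bool {
    size_of::<MoreA>() == size_of::<MoreAConstruct>()
        && align_of::<MoreA>() == align_of::<MoreAConstruct>()
        && offset_of!(MoreA, head) == offset_of!(MoreAConstruct, head)
        && offset_of!(MoreA, more) == offset_of!(MoreAConstruct, more)
}

// Evaluated at compile time: the build fails if the layouts ever diverge.
const _: () = assert!(layouts_match());

fn main() {
    println!("layouts match: {}", layouts_match());
}
```

Placing the const assertion next to the struct definitions means that a later change to one struct without the other is caught at build time rather than at the transmute.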
Peter Hall answered Nov 15 '22 07:11



It is now possible (since Rust 1.51) to initialize the fields of any uninitialized struct using the std::ptr::addr_of_mut! macro. This example is from the documentation:

You can use MaybeUninit, and the std::ptr::addr_of_mut macro, to initialize structs field by field:


use std::{ptr::addr_of_mut, mem::MaybeUninit};

#[derive(Debug, PartialEq)]
pub struct Foo {
    name: String,
    list: Vec<u8>,
}

let foo = {
    let mut uninit: MaybeUninit<Foo> = MaybeUninit::uninit();
    let ptr = uninit.as_mut_ptr();

    // Initializing the `name` field
    unsafe { addr_of_mut!((*ptr).name).write("Bob".to_string()); }

    // Initializing the `list` field
    // If there is a panic here, then the `String` in the `name` field leaks.
    unsafe { addr_of_mut!((*ptr).list).write(vec![0, 1, 2]); }

    // All the fields are initialized, so we call `assume_init` to get an initialized Foo.
    unsafe { uninit.assume_init() }
};

assert_eq!(
    foo,
    Foo {
        name: "Bob".to_string(),
        list: vec![0, 1, 2]
    }
);
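
Applied to the structs from the question, the same technique might look like the sketch below. A, MoreA and mock_ffi are copied from the question; build_more_a is a hypothetical helper name, and computing more as the sum of the two fields of A is only an assumption standing in for "based on what's in A".

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct A(i32, i32);

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct MoreA {
    head: A,
    more: i32,
}

// Stand-in for the real C function, as in the question.
unsafe fn mock_ffi(p: *mut A) {
    unsafe { p.write(A(1, 2)) }
}

// Hypothetical helper: initializes MoreA in place, field by field.
fn build_more_a() -> MoreA {
    let mut uninit = MaybeUninit::<MoreA>::uninit();
    let ptr = uninit.as_mut_ptr();
    unsafe {
        // Raw pointer to `head`; no reference to uninitialized data is created.
        mock_ffi(addr_of_mut!((*ptr).head));
        // `head` is initialized now, so it can be read through the raw pointer.
        let head = addr_of_mut!((*ptr).head).read();
        // Write `more` without reading or dropping its previous contents.
        addr_of_mut!((*ptr).more).write(head.0 + head.1);
        // Both fields are initialized, so `assume_init` is sound.
        uninit.assume_init()
    }
}

fn main() {
    let b = build_more_a();
    assert_eq!(b, MoreA { head: A(1, 2), more: 3 });
    println!("{:?}", b);
}
```

Here A is written directly into its final location inside MoreA, so there is neither a box nor a copy of the large struct, and assume_init is only called once every field holds a valid value.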
Ekrem Dinçel answered Nov 15 '22 08:11
