I am interacting with a C library function that takes a callback and an integer as arguments:
void c_fun(void (*callback)(int32_t id, intptr_t userData), intptr_t userData);
This was translated to Rust via bindgen to
unsafe extern "C" fn c_fun(callback: Option<unsafe extern "C" fn(id: i32, userData: isize)>, userData: isize);
Looking at the examples, it is clear the expectation is to cast the integer userData to a pointer to a struct to be accessed by the callback function.
Example
void callback_fun(int32_t id, intptr_t userData) {
    MyStruct *data = (MyStruct *)userData;
    // use the data pointer below...
}

int main() {
    MyStruct userData = {0};
    c_fun(callback_fun, (intptr_t)(&userData));
}
The gist of my Rust code is something like this
#[derive(Debug, Default)]
struct MyStruct {}

unsafe extern "C" fn callback(id: i32, user_data: isize) {
    let data: *mut MyStruct = std::ptr::with_exposed_provenance_mut(user_data as usize);
    // do stuff with the pointer...
}

fn main() {
    let mut data = MyStruct::default();
    let ptr = (&mut data) as *mut MyStruct;
    let addr = ptr.expose_provenance();
    let addr = addr as isize; // match C signature
    unsafe { c_fun(Some(callback), addr); } // c_fun comes from bindgen
}
The code currently compiles, runs, and it works as expected.
My questions concern provenance and converting the integer back to a pointer. Reading this section of the documentation, it is not clear to me that the code will do the right thing:
"The compiler will do its best to pick the right provenance for you, but currently we cannot provide any guarantees about which provenance the resulting pointer will have. Only one thing is clear: if there is no previously ‘exposed’ provenance that justifies the way the returned pointer will be used, the program has undefined behavior.".
Even though the code is working as expected, I am trying to understand whether there is a potential undefined behaviour here and if there is a way to mitigate/eliminate it.
What you have here is the correct way.*
The documentation you are reading is primarily focused on well-defined methods to provide provenance for integer-to-pointer casts, under the name "strict provenance", but those crucially rely on an existing pointer to derive the provenance from. So that is not helpful here.
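To make the limitation concrete, here is a minimal sketch of a strict-provenance round trip. It assumes Rust 1.84 or later, where `addr` and `with_addr` are stable; the function name `strict_roundtrip` is made up for illustration. Rebuilding the pointer needs the original `ptr` in hand, which a callback that only receives an integer does not have.

```rust
// Hypothetical helper, for illustration only.
fn strict_roundtrip() -> u32 {
    let mut value: u32 = 7;
    let ptr: *mut u32 = &mut value;

    // `addr` returns the address only; it does not expose the provenance.
    let addr: usize = ptr.addr();

    // `with_addr` combines an address with the provenance of an existing
    // pointer. Without `ptr` there is nothing to take the provenance from.
    let rebuilt: *mut u32 = ptr.with_addr(addr);

    // SAFETY: `rebuilt` has the same address and provenance as `ptr`,
    // which points to the live local `value`.
    unsafe { *rebuilt += 1 };
    value
}

fn main() {
    println!("{}", strict_roundtrip());
}
```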
The introduction of strict provenance also provided documentation to describe how integer-to-pointer casts are valid otherwise, under the name "exposed provenance", which is what you are using. The exact semantics are still unclear, but under the current wording the earlier expose_provenance call is what makes the later with_exposed_provenance_mut call valid.
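As a rough, runnable sketch of that pattern: the `c_fun` below is a hypothetical stand-in written in Rust so the example is self-contained (the real one comes from bindgen and is implemented in C), and the `last_id` field and the `run_callback` helper are likewise assumptions added for illustration. It assumes Rust 1.84 or later.

```rust
use std::ptr;

#[derive(Debug, Default)]
struct MyStruct {
    last_id: i32, // hypothetical field so the callback has something to write
}

// Hypothetical stand-in for the C function; it just invokes the callback once.
unsafe extern "C" fn c_fun(
    callback: Option<unsafe extern "C" fn(id: i32, user_data: isize)>,
    user_data: isize,
) {
    if let Some(cb) = callback {
        unsafe { cb(7, user_data) };
    }
}

unsafe extern "C" fn callback(id: i32, user_data: isize) {
    // Pick up the provenance that was exposed before the call to `c_fun`.
    let data: *mut MyStruct = ptr::with_exposed_provenance_mut(user_data as usize);
    // SAFETY: the caller passed the address of a live `MyStruct` whose
    // provenance was exposed, and nothing else accesses it during the call.
    unsafe { (*data).last_id = id };
}

fn run_callback() -> i32 {
    let mut data = MyStruct::default();
    let ptr: *mut MyStruct = &mut data;
    // Expose the provenance, then convert to match the C signature.
    let addr = ptr.expose_provenance() as isize;
    unsafe { c_fun(Some(callback), addr) };
    data.last_id
}

fn main() {
    println!("{}", run_callback());
}
```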
It should also be understood that while "strict provenance" is a new concept, "exposed provenance" is really just documenting the existing behavior of the compiler in an abstract, non-committal way. The above calls are also equivalent to pointer-to-integer and integer-to-pointer `as` casts. So we've been operating under this model the whole time; this is just a (still incomplete) specification for how such casts can currently be modeled.
So to my understanding, this really only prevents you from doing things like having Rust cast an integer to a pointer that ultimately refers to a Rust object that was not "exposed", meaning you got the address via transmutation, raw memory snooping, out-of-bounds offsets, or other shenanigans. But your scenario is following the model to the letter.
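A small sketch of that equivalence, with two hypothetical helper functions that perform the same round trip, once with the named methods (assuming Rust 1.84 or later) and once with plain `as` casts:

```rust
use std::ptr;

// Round trip using the explicit exposed-provenance functions.
fn roundtrip_with_methods(data: &mut i32) -> i32 {
    let p: *mut i32 = data;
    let addr: usize = p.expose_provenance();
    let q: *mut i32 = ptr::with_exposed_provenance_mut(addr);
    // SAFETY: `q` points to the same live `i32` whose provenance was exposed.
    unsafe {
        *q += 1;
        *q
    }
}

// The same round trip spelled with `as` casts.
fn roundtrip_with_as(data: &mut i32) -> i32 {
    let p: *mut i32 = data;
    let addr = p as usize; // pointer-to-integer cast, exposes the provenance
    let q = addr as *mut i32; // integer-to-pointer cast
    // SAFETY: same reasoning as above.
    unsafe {
        *q += 1;
        *q
    }
}

fn main() {
    let mut a = 1;
    let mut b = 1;
    println!("{} {}", roundtrip_with_methods(&mut a), roundtrip_with_as(&mut b));
}
```

The named functions mainly make the intent visible to readers and to tooling; the casts themselves behave the same way.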
If you are interested in the motivations behind this, I encourage you to read Ralf Jung's Pointers are Complicated series (part 1, part 2, part 3). There is also the original tracking issue for strict provenance.
* It is correct as of now. Unfortunately, it is hard to say anything definitive, especially since even the official wording calls exposed provenance "unclear" and the internal discussions I've read are unsurprisingly also unclear. Some concerns voiced on the topic make it sound like drastic changes need to be made to pointer-integer interactions, but it is hard to tell how much of that will surface as tangible changes, if any. Your code is well within what is documented, so I personally find it unlikely to change.
I would normally suggest using miri to check your unsafe code, but it currently doesn't track exposed provenance and therefore can't report undefined behavior from misusing it.
It would be better if the API actually used pointers since that would avoid all this headache.
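For comparison, here is a sketch of what the Rust side could look like if the library took a `void *` for the user data. This API is purely hypothetical: `c_fun_ptr` is a made-up stand-in implemented in Rust so the example runs, and the fields on `MyStruct` and the `run` helper are also assumptions. The pointer is passed through unchanged, so no integer round trip is needed and the provenance question never comes up.

```rust
use std::ffi::c_void;

#[derive(Debug, Default)]
struct MyStruct {
    calls: u32,   // hypothetical fields for illustration
    last_id: i32,
}

// Hypothetical pointer-based variant of the C API; the real library
// does not provide this function.
unsafe extern "C" fn c_fun_ptr(
    callback: Option<unsafe extern "C" fn(id: i32, user_data: *mut c_void)>,
    user_data: *mut c_void,
) {
    if let Some(cb) = callback {
        unsafe { cb(42, user_data) };
    }
}

unsafe extern "C" fn callback(id: i32, user_data: *mut c_void) {
    // SAFETY: the caller passed a pointer to a live `MyStruct` that is
    // not accessed by anything else for the duration of the call.
    let data = unsafe { &mut *user_data.cast::<MyStruct>() };
    data.calls += 1;
    data.last_id = id;
}

fn run() -> (u32, i32) {
    let mut data = MyStruct::default();
    let ptr: *mut MyStruct = &mut data;
    // The pointer keeps its provenance all the way through the call.
    unsafe { c_fun_ptr(Some(callback), ptr.cast::<c_void>()) };
    (data.calls, data.last_id)
}

fn main() {
    println!("{:?}", run());
}
```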