While raw pointers in Rust have the offset method, it only moves the pointer in units of the pointed-to type's size. How can I offset a pointer by a number of bytes?
Something like this in C:
var_offset = (typeof(var))((char *)(var) + offset);
In computer science, offset describes the location of a piece of data compared to another location. For example, when a program is accessing an array of bytes, the fifth byte is offset by four bytes from the array's beginning.
TL;DR: This answer invokes Undefined Behavior, according to RFC-2582.
In particular, references must be aligned and dereferenceable, even when they are created and never used.
There are also discussions that field accesses themselves impose extra requirements not solved by the proposed &raw, due to the usage of getelementptr inbounds; see "offsetof woes" at the bottom of the RFC.
From the answer I linked to your previous question:
macro_rules! offset_of {
    ($ty:ty, $field:ident) => {
        // Undefined Behavior: dereferences a null pointer.
        // Undefined Behavior: accesses field outside of valid memory area.
        unsafe { &(*(0 as *const $ty)).$field as *const _ as usize }
    }
}

fn main() {
    let p: *const Baz = 0x1248 as *const _;
    let p2: *const Foo = ((p as usize) - offset_of!(Foo, memberB)) as *const _;
    println!("{:p}", p2);
}
We can see from the computation of p2 that a pointer can be converted painlessly to an integer (usize here), on which arithmetic is performed, and then the result is cast back to a pointer.
isize and usize are the universal byte-sized pointer types :)
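As a complement to the integer round-trip, byte-wise offsetting can also be sketched by casting to a `u8` pointer, moving it, and casting back. The helper name `offset_bytes` and the sample array are made up for illustration. `wrapping_offset` is used so that computing the new address is itself safe; only the final read needs `unsafe`.

```rust
/// Moves `ptr` by `bytes` bytes rather than by multiples of `size_of::<T>()`.
/// `offset_bytes` is a hypothetical helper name, not a standard library function.
fn offset_bytes<T>(ptr: *const T, bytes: isize) -> *const T {
    // A `*const u8` has a pointee size of 1, so its offsets are counted in bytes.
    (ptr as *const u8).wrapping_offset(bytes) as *const T
}

fn main() {
    let values: [u32; 4] = [10, 20, 30, 40];
    let first: *const u32 = values.as_ptr();

    // Two `u32` elements further on is 8 bytes further on.
    let third = offset_bytes(first, 2 * std::mem::size_of::<u32>() as isize);

    // SAFETY: `third` points at `values[2]`, which is in bounds, aligned and initialized.
    let value = unsafe { *third };
    println!("{}", value); // prints 30
}
```

Recent toolchains also provide byte-wise methods such as `byte_offset` and `byte_add` directly on raw pointers, which express the same idea without the casts.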
Were RFC-2582 to be accepted, this implementation of offset_of! is my best shot:
macro_rules! offset_of {
    ($ty:ty, $field:ident) => {
        unsafe {
            // Create correctly sized storage.
            //
            // Note: `let zeroed: $ty = ::std::mem::zeroed();` is incorrect,
            //       a zero pattern is not always a valid value.
            let buffer = ::std::mem::MaybeUninit::<$ty>::uninit();

            // Create a Raw reference to the storage:
            // - Alignment does not matter, though is correct here.
            // - It safely refers to uninitialized storage.
            //
            // Note: using `&raw const *(&buffer as *const _ as *const $ty)`
            //       is incorrect, it creates a temporary non-raw reference.
            let uninit: &raw const $ty = ::std::mem::transmute(&buffer);

            // Create a Raw reference to the field:
            // - Alignment does not matter, though is correct here.
            // - It points within the memory area.
            // - It safely refers to uninitialized storage.
            let field = &raw const uninit.$field;

            // Compute the difference between pointers.
            (field as *const _ as usize) - (uninit as *const _ as usize)
        }
    }
}
I have commented each step with the reasons I believe it is sound, and why some alternatives are not -- something I encourage heavily in unsafe code -- and I hope I have not missed anything.
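For readers on a current toolchain, here is a minimal sketch of the same idea using APIs that exist today. It assumes `std::ptr::addr_of!` (which yields a plain `*const T` rather than a new reference type) and `std::mem::offset_of!` are available. The macro name `manual_offset_of` and the struct `Foo` with fields `a`, `b`, `c` are made up for illustration. The result is compared against the standard library's own `offset_of!`.

```rust
/// Hypothetical macro name; mirrors the steps described above.
macro_rules! manual_offset_of {
    ($ty:ty, $field:ident) => {{
        // Correctly sized and aligned storage that is never initialized or read.
        let buffer = ::std::mem::MaybeUninit::<$ty>::uninit();
        let base: *const $ty = buffer.as_ptr();

        // SAFETY: `base` points into `buffer`, so the field projection stays
        // within one allocation; `addr_of!` creates no intermediate reference
        // and does not read the uninitialized memory.
        let field = unsafe { ::std::ptr::addr_of!((*base).$field) };

        // Difference between the two addresses, in bytes.
        (field as usize) - (base as usize)
    }};
}

// `repr(C)` fixes the field order so the offsets below are predictable.
#[allow(dead_code)]
#[repr(C)]
struct Foo {
    a: u8,
    b: u32,
    c: u16,
}

fn main() {
    let manual = manual_offset_of!(Foo, b);
    let builtin = std::mem::offset_of!(Foo, b);
    assert_eq!(manual, builtin);
    println!("offset of Foo::b = {}", manual);
}
```

Where the toolchain has it, `std::mem::offset_of!` alone is probably the simpler choice; the manual macro is shown only to make the individual steps visible.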
Thanks to @Matthieu M.'s answer, this can be done using pointer offsets. Here's a set of reusable macros:
macro_rules! offset_of {
    ($ty:ty, $field:ident) => {
        &(*(0 as *const $ty)).$field as *const _ as usize
    }
}

macro_rules! check_type_pair {
    ($a:expr, $b:expr) => {
        if false {
            let _type_check = if false {$a} else {$b};
        }
    }
}

macro_rules! parent_of_mut {
    ($child:expr, $ty:ty, $field:ident) => {
        {
            check_type_pair!(&(*(0 as *const $ty)).$field, &$child);
            let offset = offset_of!($ty, $field);
            &mut *(((($child as *mut _) as usize) - offset) as *mut $ty)
        }
    }
}

macro_rules! parent_of {
    ($child:expr, $ty:ty, $field:ident) => {
        {
            check_type_pair!(&(*(0 as *const $ty)).$field, &$child);
            let offset = offset_of!($ty, $field);
            &*(((($child as *const _) as usize) - offset) as *const $ty)
        }
    }
}
This way, when we have a field in a struct, we can get the parent struct like this:
fn some_method(&self) {
    // Where 'self' is ParentStruct.field,
    // access ParentStruct instance.
    let parent = unsafe { parent_of!(self, ParentStruct, field) };
}
The macro check_type_pair helps avoid simple mistakes where self and ParentStruct.field aren't the same type. However, it's not foolproof when two different members in a struct have the same type.
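The parent-from-field computation can also be sketched without dereferencing a null pointer, assuming a toolchain that provides `std::mem::offset_of!`. The types `Parent` and `Child` and the function `parent_of_child` are hypothetical names for illustration. The sketch takes a raw pointer that was derived from a pointer to the whole parent, since a reference to the field alone may not be allowed to reach the rest of the parent under the aliasing models currently being discussed for Rust.

```rust
use std::mem::offset_of;

#[allow(dead_code)]
struct Child {
    value: u8,
}

struct Parent {
    id: u32,
    child: Child,
}

/// Steps back from a pointer to `Parent::child` to the enclosing `Parent`.
///
/// SAFETY (caller): `child_ptr` must point at the `child` field of a live
/// `Parent` and must have been derived from a pointer to that whole `Parent`.
unsafe fn parent_of_child(child_ptr: *const Child) -> *const Parent {
    // Move back by the field's byte offset, counting in `u8` units.
    (child_ptr as *const u8).sub(offset_of!(Parent, child)) as *const Parent
}

fn main() {
    let parent = Parent { id: 7, child: Child { value: 3 } };
    let parent_ptr: *const Parent = &parent;

    // Field pointer derived from the pointer to the whole struct.
    let child_ptr: *const Child = unsafe { std::ptr::addr_of!((*parent_ptr).child) };

    // SAFETY: `child_ptr` satisfies the contract documented on `parent_of_child`.
    let recovered = unsafe { parent_of_child(child_ptr) };
    assert!(std::ptr::eq(recovered, parent_ptr));
    println!("{}", unsafe { (*recovered).id }); // prints 7
}
```

Because the field name is passed to `offset_of!` explicitly, this sketch shares the limitation noted above: nothing stops a caller from passing a pointer to a different field of the same type.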