Is transmuting PhantomData markers safe?

Tags:

unsafe

This is taken out of context so it might seem a bit weird, but I have the following data structure:

use std::marker::PhantomData;

pub struct Map<T, M=()> {
    data: Vec<T>,
    _marker: PhantomData<fn(M) -> M>,
}

Map is an associative map where keys are "marked" to prevent using keys from one map on another unrelated map. Users can opt into this by passing some unique type they've made as M, for example:

struct PlayerMapMarker;
let mut player_map: Map<String, PlayerMapMarker> = Map::new();

This is all fine, but some iterators (e.g. the ones giving only values) I want to write for this map do not contain the marker in their type. Would the following transmute be safe to discard the marker?

fn discard_marker<T, M>(map: &Map<T, M>) -> &Map<T, ()> {
    unsafe { std::mem::transmute(map) }
}

So that I could write and use:

fn values(&self) -> Values<T> {
    Values { inner: discard_marker(self).iter() }
}

struct Values<'a, T> {
    inner: Iter<'a, T, ()>,
}


asked Oct 21 '18 03:10

orlp

1 Answer

TL;DR: Add #[repr(C)] and you should be good.

There are two separate concerns here: Whether the transmute is valid in the sense of returning valid data at the return type, and whether the entire thing violates any higher-level invariants that might be attached to the involved types. (In the terminology of my blog post, you have to make sure that both validity and safety invariants are maintained.)

For the validity invariant, you are in uncharted territory. The compiler could decide to lay out Map<T, M> very differently from Map<T, ()>, i.e. the data field could be at a different offset and there could be spurious padding. It does not seem likely, but so far we are guaranteeing very little here. Discussion about what we can and want to guarantee there is happening right now. We purposefully want to avoid making too many guarantees about repr(Rust) to avoid painting ourselves into a corner.

What you could do is to add repr(C) to your struct, then I am fairly sure you can count on ZSTs not changing anything (but I asked for clarification just to be sure). For repr(C) we provide more guarantees about how the struct is laid out, which in fact is its entire purpose. If you want to play tricks with struct layout, you should probably add that attribute.
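As a rough sketch of that suggestion, the struct from the question could look like this with the attribute added. The `new`, `push`, and `demo` items are hypothetical helpers that exist only to make the example self-contained, and the reference is converted with a raw-pointer cast rather than `transmute`; both rely on the same layout assumption, namely that the two instantiations differ only in a zero-sized field.

```rust
use std::marker::PhantomData;

// Same fields as in the question; #[repr(C)] asks for the C layout rules
// instead of the unspecified default (repr(Rust)) layout.
#[repr(C)]
pub struct Map<T, M = ()> {
    data: Vec<T>,
    _marker: PhantomData<fn(M) -> M>,
}

impl<T, M> Map<T, M> {
    // Hypothetical constructor, not taken from the question.
    pub fn new() -> Self {
        Map { data: Vec::new(), _marker: PhantomData }
    }

    // Hypothetical helper so the demo has something to read back.
    pub fn push(&mut self, value: T) {
        self.data.push(value);
    }
}

pub fn discard_marker<T, M>(map: &Map<T, M>) -> &Map<T, ()> {
    // Assumption: Map<T, M> and Map<T, ()> have identical layout because
    // they differ only in the type parameter of a zero-sized PhantomData.
    unsafe { &*(map as *const Map<T, M> as *const Map<T, ()>) }
}

pub struct PlayerMapMarker;

// Hypothetical usage: build a marked map, then view it without the marker.
pub fn demo() -> Vec<String> {
    let mut player_map: Map<String, PlayerMapMarker> = Map::new();
    player_map.push("alice".to_string());
    player_map.push("bob".to_string());
    let unmarked: &Map<String, ()> = discard_marker(&player_map);
    unmarked.data.clone()
}
```

The returned reference borrows from the input, so the usual lifetime rules still apply to it.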

For the higher-level safety invariant, you must be careful not to create a broken Map and let that "leak" beyond the borders of your API (into the surrounding safe code), i.e. you shouldn't return an instance of Map that violates any invariants you might have put on it. Moreover, PhantomData has some effects on variance and the drop checker that you should be aware of. Since the types being transmuted are so trivial (your marker types don't require dropping, i.e. neither they nor any of their transitive fields implement Drop), I do not think you have to expect any problems from this side.

To be clear, repr(Rust) (the default) might also be fine once we decide this is something we want to guarantee, and ignoring size-0-align-1 types (like PhantomData) entirely seems like a pretty sensible guarantee to me. Personally, though, I'd still advise using repr(C) unless that has a cost you are not willing to pay (e.g. because you lose the compiler's automatic size-reduction-by-reordering and cannot replicate it manually).
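To make the "size-0-align-1" point concrete, here is a small sketch that queries the size and alignment of the marker field type used in the question. The `Marker` alias and the `marker_layout` function are hypothetical names introduced only for this illustration. It says nothing about how the enclosing struct is laid out under repr(Rust); it only shows that the field itself occupies no space.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};

pub struct PlayerMapMarker;

// Hypothetical alias for the marker field type used in the question.
pub type Marker<M> = PhantomData<fn(M) -> M>;

// Returns (size, alignment) of the marker field for a given M.
pub fn marker_layout<M>() -> (usize, usize) {
    (size_of::<Marker<M>>(), align_of::<Marker<M>>())
}
```

Calling `marker_layout::<PlayerMapMarker>()` and `marker_layout::<()>()` should both give `(0, 1)`, regardless of what `M` is.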

answered Sep 28 '22 21:09

Ralf Jung
