
Why does Rust use two bytes to represent this enum when only one is necessary?

It appears to be smart enough to only use one byte for A, but not smart enough to use one byte for B, even though there are only 8*8=64 possibilities. Is there any way to coax Rust to figure this out or do I have to manually implement a more compact layout?

#![allow(dead_code)]

enum A {
    L,
    UL,
    U,
    UR,
    R,
    DR,
    D,
    DL,
}

enum B {
    C(A, A),
}

fn main() {
    println!("{:?}", std::mem::size_of::<A>()); // prints 1
    println!("{:?}", std::mem::size_of::<B>()); // prints 2
}
Joseph Garvin asked Dec 08 '22 12:12

1 Answer

Both bytes are necessary to preserve the ability to borrow struct members.

A type in Rust is not an idealized set of values: it has a data layout, which describes how the values are stored. One of the "rules" governing the language is that putting a type inside a struct or enum doesn't change its data layout: it has the same layout inside another type as it does standalone, which allows you to take references to struct members and use them interchangeably with any other reference.*
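As a rough illustration of that rule, the sketch below borrows both fields of a B and passes them to a function that accepts any &A. The names describe and field_distance are hypothetical helpers invented for this example; they are not part of the question.

```rust
// `A` and `B` mirror the types from the question.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum A { L, UL, U, UR, R, DR, D, DL }

enum B { C(A, A) }

// Hypothetical helper: accepts a reference to any `A`, no matter where it is stored.
fn describe(a: &A) -> String {
    format!("{:?}", a)
}

// Hypothetical helper: distance in bytes between the two fields of a `B`.
fn field_distance(b: &B) -> usize {
    let B::C(first, second) = b;
    let p1 = first as *const A as usize;
    let p2 = second as *const A as usize;
    p1.abs_diff(p2)
}

fn main() {
    let standalone = A::U;
    let b = B::C(A::L, A::DR);
    let B::C(first, second) = &b;

    // All three are plain `&A` references and can be used interchangeably.
    println!("{}", describe(&standalone));
    println!("{}", describe(first));
    println!("{}", describe(second));

    // Each field occupies its own addressable byte.
    println!("{}", field_distance(&b));
}
```

Because first and second must each be valid &A values, each field needs its own address, which is why the two fields cannot share a byte.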

There's no way to fit two As into one byte while satisfying this constraint, because the size of A is one whole byte -- you can't address a part of a byte, even with repr(packed). The unused bits just remain unused (unless they can be repurposed to store the enum tag by niche-filling).
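If the one-byte representation matters more than being able to borrow the fields, a manual layout is one option. The following is a minimal sketch under that assumption: PackedB, A::ALL, and from_bits are hypothetical names invented here, and the accessors return A by value because there is no addressable A inside the byte to borrow.

```rust
use std::mem::size_of;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum A { L, UL, U, UR, R, DR, D, DL }

impl A {
    // Lookup table in declaration order, so the index matches `variant as u8`.
    const ALL: [A; 8] = [A::L, A::UL, A::U, A::UR, A::R, A::DR, A::D, A::DL];

    // Decode the low three bits back into a variant.
    fn from_bits(bits: u8) -> A {
        A::ALL[(bits & 0b111) as usize]
    }
}

// Hypothetical compact replacement for `B`: two 3-bit values in one byte.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PackedB(u8);

impl PackedB {
    fn new(first: A, second: A) -> Self {
        PackedB(((first as u8) << 3) | (second as u8))
    }

    // Accessors return copies, not references.
    fn first(self) -> A {
        A::from_bits(self.0 >> 3)
    }

    fn second(self) -> A {
        A::from_bits(self.0)
    }
}

fn main() {
    let b = PackedB::new(A::UR, A::DL);
    println!("{:?} {:?}", b.first(), b.second());
    println!("{}", size_of::<PackedB>());

    // Niche-filling: the unused bit patterns of `A` can hold the `None` tag.
    // The result depends on the compiler's layout choices, so it is only printed here.
    println!("{}", size_of::<Option<A>>());
}
```

The trade-off is that code using PackedB has to go through first() and second() and cannot hold a &A pointing into it.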

*Well, repr(packed) can actually make this untrue. Taking a reference to a misaligned packed field is undefined behavior; older compilers allowed it even in safe code, while current ones reject it with an error.

trent answered Feb 02 '23 03:02
