I read the documentation about the size of enums in Swift, and here is my understanding.

This first simple enum only holds a 'tag' to differentiate the cases, which is by default a UInt8 value, i.e. small = 0, medium = 1 and so on. So Size's size is 1 byte, which can be verified with MemoryLayout<Size>.size. I also noted that if an enum has more than 255 cases, the tag size is of course upgraded to 2 bytes:
enum Size {
    case small
    case medium
    case large
}
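This can be checked directly in a playground or script (a minimal sketch reusing the Size enum above; the values in the comments are what I expect from the reasoning above):

print(MemoryLayout<Size>.size)      // expected: 1 (a single UInt8 tag)
print(MemoryLayout<Size>.stride)    // expected: 1
print(MemoryLayout<Size>.alignment) // expected: 1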
Second case: if an enum has associated values it behaves like a union. In that case the enum's size is the size of the tag plus the size of the largest associated value. In the following example the size is 1 byte + 16 bytes (String), so 17 bytes, which can also be verified with MemoryLayout.
enum Value {
    case int(Int)
    case double(Double)
    case string(String)
    case bool(Bool)
}
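Again, a minimal check with MemoryLayout (a sketch; the numbers in the comments follow the tag-plus-largest-payload reasoning above):

print(MemoryLayout<Value>.size)      // expected: 17 (16-byte String payload plus a 1-byte tag)
print(MemoryLayout<Value>.stride)    // expected: 24 (size rounded up to the 8-byte alignment)
print(MemoryLayout<Value>.alignment) // expected: 8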
Last case: since Swift is a safe language, references are always valid in standard non-unsafe Swift code, i.e. always pointing to a value in memory. This allows the compiler to optimise such an enum when T is a reference type:
enum Opt<T> {
    case none
    case some(T)
}
Here an instance of type T cannot be nil (NULL), so the compiler uses this special value for the none case; hence Opt is 8 bytes instead of 9 bytes when T is a reference type. This optimisation is discussed in this SO question about Rust, which I believe has the same behaviour as Swift concerning enums.
For instance, with this simple reference type, MemoryLayout returns a size of 8 bytes:
class Person {
    var name: String
    init(name: String) {
        self.name = name
    }
}
let p = Opt.some(Person(name: "Bob")) // 8 bytes
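The contrast with a non-reference payload makes the optimisation visible (a minimal sketch, using Opt and Person as defined above):

print(MemoryLayout<Opt<Person>>.size) // 8: the NULL bit pattern encodes .none
print(MemoryLayout<Opt<Int>>.size)    // expected: 9, since Int uses all 64 bits and needs a separate tag byte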
What I cannot figure out is the size of this enum (still when T is a reference type):
enum Opt<T> {
    case none
    case secondNone
    case some(T)
}
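Checking the observation in code (a minimal sketch, assuming this three-case Opt replaces the earlier two-case definition and reusing Person from above):

print(MemoryLayout<Opt<Person>>.size) // still 8, even with the extra secondNone case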
Why is this one also 8 bytes according to MemoryLayout?

In my understanding it should be 9 bytes. The NULL optimisation is only possible because none can be represented by NULL, but there is no 'second' NULL value for secondNone in my example, so a tag should be required here to differentiate the cases.

Does the compiler automatically turn this enum into a reference type (similar to an indirect enum) because of this? That would explain the 8-byte size. How can I verify this last hypothesis?
A Swift enum can have either raw values or associated values. Why is that? It's because of the definition of a raw value: a raw value is something that uniquely identifies a value of a particular type. “Uniquely” means that you don't lose any information by using the raw value instead of the original value.
With a Swift enum we can define a data type that has a fixed set of related values. However, sometimes we may want to attach additional information to the enum values; this additional information attached to enum values is called an associated value.
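For instance (a hedged sketch; Planet and Barcode are made-up illustrative types, not part of the question):

// Raw values: each case maps to exactly one fixed value, known at compile time.
enum Planet: Int {
    case mercury = 1, venus, earth
}
if let planet = Planet(rawValue: 2) {
    print(planet) // venus
}

// Associated values: each instance carries its own extra data, chosen at runtime.
enum Barcode {
    case upc(Int, Int, Int, Int)
    case qrCode(String)
}
switch Barcode.qrCode("ABCDEF") {
case .upc(let a, let b, let c, let d):
    print("UPC: \(a) \(b) \(c) \(d)")
case .qrCode(let value):
    print("QR: \(value)") // QR: ABCDEF
}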
We can tell that an enum holds a strong reference to its associated values from the fact that Swift's Optional is implemented using an enum, and it can hold any object reference without releasing it.
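A quick way to observe the strong reference (a minimal sketch; Tracker and Holder are throwaway types introduced only for this demonstration):

class Tracker {
    deinit { print("Tracker deallocated") }
}

enum Holder {
    case payload(Tracker)
}

var strong: Tracker? = Tracker()
weak var weakRef = strong

let holder = Holder.payload(strong!) // the enum case now holds its own strong reference
strong = nil                         // drop the original reference

// If the enum did not retain its associated value, weakRef would already be nil
// and "Tracker deallocated" would have printed by now.
withExtendedLifetime(holder) {
    print(weakRef == nil ? "released" : "still alive") // expected: still alive
}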
Types in Swift fall into one of two categories: first, “value types”, where each instance keeps a unique copy of its data, usually defined as a struct, enum, or tuple; second, “reference types”, where instances share a single copy of the data, and the type is usually defined as a class.
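The difference is easy to demonstrate (a minimal sketch with made-up types):

struct PointValue { var x = 0 }
class PointRef { var x = 0 }

var v1 = PointValue()
var v2 = v1       // copy: v2 gets its own storage
v2.x = 42
print(v1.x, v2.x) // 0 42

let r1 = PointRef()
let r2 = r1       // both names refer to the same instance
r2.x = 42
print(r1.x, r2.x) // 42 42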
From Type Layout: Single-Payload Enums:
If the data type's binary representation has extra inhabitants, that is, bit patterns with the size and alignment of the type but which do not form valid values of that type, they are used to represent the no-data cases, with extra inhabitants in order of ascending numeric value matching no-data cases in declaration order.
Your example with more cases:
enum Opt<T> {
    case a, b, c, d, e, f, g, h, i, j, k
    case l, m, n, o, p, q, r, s, t, u, v
    case some(T)
}
class Person {
    var name: String
    init(name: String) { self.name = name }
}
print(unsafeBitCast(Opt<Person>.a, to: UnsafeRawPointer.self))
// 0x0000000000000000
print(unsafeBitCast(Opt<Person>.b, to: UnsafeRawPointer.self))
// 0x0000000000000002
print(unsafeBitCast(Opt<Person>.v, to: UnsafeRawPointer.self))
// 0x000000000000002a
let p = Person(name: "Bob")
print(unsafeBitCast(Opt.some(p), to: UnsafeRawPointer.self))
// 0x00006030000435d0
Apparently, 0x0, 0x2, ..., 0x2a are invalid bit patterns for a pointer, and therefore used for the additional cases.
The precise algorithm seems to be undocumented; one would probably have to inspect the Swift compiler source code.
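Consistent with the extra-inhabitants rule, the layout can also be checked with MemoryLayout (a sketch; the expected values follow from the quoted rule rather than from a documented guarantee):

print(MemoryLayout<Opt<Person>>.size) // expected: 8, the 22 no-payload cases fit in invalid pointer bit patterns
print(MemoryLayout<Opt<Int>>.size)    // expected: 9, Int has no extra inhabitants so a separate tag byte is added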