
Swift enum size when associated value is a reference type

I read the documentation about the size of enums in Swift, and here is my understanding:

This simple one only holds a 'tag' to differentiate cases, which is by default a UInt8 value, i.e. small = 0, medium = 1 and so on. So Size's size is 1 byte, which can be verified with MemoryLayout<Size>.size. I also noted that if an enum has more than 256 cases (more than a UInt8 can distinguish), the tag size is upgraded to 2 bytes.

enum Size {
    case small
    case medium
    case large
}
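As a quick check of the claim above, the following sketch prints the layout of Size with MemoryLayout. The expected values in the comments assume a current Swift toolchain.

```swift
enum Size {
    case small
    case medium
    case large
}

// Bytes occupied by one value, excluding trailing padding.
print(MemoryLayout<Size>.size)       // 1
// Distance in bytes between consecutive elements in an array.
print(MemoryLayout<Size>.stride)     // 1
// Required alignment in bytes.
print(MemoryLayout<Size>.alignment)  // 1
```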

Second case: if an enum has associated values, it behaves like a union. In this case the enum size is the size of the tag plus the size of the largest associated value. In the following example (on a 64-bit platform) the size is 1 byte + 16 bytes (String), so 17 bytes, which can also be verified with MemoryLayout.

enum Value {
    case int(Int)
    case double(Double)
    case string(String)
    case bool(Bool)
}
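The 17-byte figure can be checked with a short sketch like the one below. The numbers in the comments assume a 64-bit platform, where String is 16 bytes; the stride is the size rounded up to the 8-byte alignment.

```swift
enum Value {
    case int(Int)
    case double(Double)
    case string(String)
    case bool(Bool)
}

// Largest payload: String is 16 bytes on 64-bit platforms.
print(MemoryLayout<String>.size)     // 16
// Payload plus one tag byte.
print(MemoryLayout<Value>.size)      // 17
// Size rounded up to the alignment, i.e. the space used per array element.
print(MemoryLayout<Value>.stride)    // 24
print(MemoryLayout<Value>.alignment) // 8
```

Note that an array of Value uses 24 bytes per element rather than 17, because elements are spaced by stride, not size.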

Last case: as Swift is a safe language, references are always valid in standard (non-unsafe) Swift code, i.e. they always point to a value in memory. This allows the compiler to optimise an enum such as the following when T is a reference type:

enum Opt<T> {
    case none
    case some(T)
}

Here an instance of type T cannot be nil (NULL), so the compiler uses this special value for the none case; hence Opt is 8 bytes instead of 9 bytes when T is a reference type. This optimisation is raised in this SO question about Rust, which I believe has the same behaviour as Swift concerning enums.

For instance with this simple reference type, MemoryLayout returns a size of 8 bytes:

class Person {
    var name: String

    init(name: String) {
        self.name = name
    }
}

let p = Opt.some(Person(name: "Bob"))  // 8 bytes
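To contrast the two situations, this sketch compares Opt with a class payload against Opt with an Int payload, and does the same for the built-in Optional. The values in the comments assume a 64-bit platform.

```swift
enum Opt<T> {
    case none
    case some(T)
}

class Person {
    var name: String
    init(name: String) { self.name = name }
}

// A class reference is never null, so `none` can reuse the null bit pattern.
print(MemoryLayout<Opt<Person>>.size)  // 8
// Every bit pattern of Int is a valid Int, so a separate tag byte is needed.
print(MemoryLayout<Opt<Int>>.size)     // 9

// The standard library's Optional behaves the same way.
print(MemoryLayout<Person?>.size)      // 8
print(MemoryLayout<Int?>.size)         // 9
```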

Question

What I cannot figure out is the size of this enum (still when T is a reference type):

enum Opt<T> {
    case none
    case secondNone
    case some(T)
}

Why is this one also 8 bytes, according to MemoryLayout?

In my understanding it should be 9 bytes. The NULL optimisation is only possible because none can be represented by NULL but there is no 'second' NULL value for secondNone in my example, so a tag should be required here to differentiate the cases.

Does the compiler automatically turn this enum into a reference type (similar to an indirect enum) because of this? This would explain the 8-byte size. How can I verify this last hypothesis?

Louis Lac asked May 28 '20 10:05


1 Answer

From Type Layout: Single-Payload Enums:

If the data type's binary representation has extra inhabitants, that is, bit patterns with the size and alignment of the type but which do not form valid values of that type, they are used to represent the no-data cases, with extra inhabitants in order of ascending numeric value matching no-data cases in declaration order.
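Extra inhabitants are not specific to pointers. As a small illustration (not taken from the quoted document), Bool occupies one byte but only uses two of its 256 bit patterns, so wrapping it in Optional costs nothing, whereas UInt8 uses every bit pattern and needs an extra tag byte:

```swift
// Bool has many unused bit patterns (extra inhabitants) in its single byte.
print(MemoryLayout<Bool>.size)    // 1
print(MemoryLayout<Bool?>.size)   // 1
print(MemoryLayout<Bool??>.size)  // 1

// UInt8 has no unused bit patterns, so Optional adds a tag byte.
print(MemoryLayout<UInt8>.size)   // 1
print(MemoryLayout<UInt8?>.size)  // 2
```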

Your example with more cases:

enum Opt<T> {
    case a, b, c, d, e, f, g, h, i, j, k
    case l, m, n, o, p, q, r, s, t, u, v
    case some(T)
}

class Person {
    var name: String
    init(name: String) { self.name = name }
}

print(unsafeBitCast(Opt<Person>.a, to: UnsafeRawPointer.self))
// 0x0000000000000000

print(unsafeBitCast(Opt<Person>.b, to: UnsafeRawPointer.self))
// 0x0000000000000002

print(unsafeBitCast(Opt<Person>.v, to: UnsafeRawPointer.self))
// 0x000000000000002a

let p = Person(name: "Bob")
print(unsafeBitCast(Opt.some(p), to: UnsafeRawPointer.self))
// 0x00006030000435d0

Apparently, 0x0, 0x2, ..., 0x2a are invalid bit patterns for a pointer, and therefore used for the additional cases.
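The same check can be run on the exact enum from the question. The sketch below casts to UInt rather than UnsafeRawPointer, which avoids forming a non-optional pointer from a null bit pattern. The concrete bit patterns are platform-dependent (the values above look like they come from a platform with Objective-C interop), so no expected output is given for them.

```swift
enum Opt<T> {
    case none
    case secondNone
    case some(T)
}

class Person {
    var name: String
    init(name: String) { self.name = name }
}

// Still pointer-sized: no tag byte was added for the second no-payload case.
print(MemoryLayout<Opt<Person>>.size == MemoryLayout<UInt>.size)  // true

// Raw bit patterns of the two no-payload cases.
let noneBits = unsafeBitCast(Opt<Person>.none, to: UInt.self)
let secondNoneBits = unsafeBitCast(Opt<Person>.secondNone, to: UInt.self)

// Raw bit pattern of a payload case.
let p = Person(name: "Bob")
let someBits = unsafeBitCast(Opt.some(p), to: UInt.self)

print("none:       0x" + String(noneBits, radix: 16))
print("secondNone: 0x" + String(secondNoneBits, radix: 16))
print("some:       0x" + String(someBits, radix: 16))
```

If the two no-payload cases print distinct small values while the enum stays pointer-sized, that is consistent with extra inhabitants being used rather than the enum being boxed like an indirect enum.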

The precise algorithm seems to be undocumented; one would probably have to inspect the Swift compiler source code.

Martin R answered Nov 15 '22 12:11