I'm surprised to find out that this compiles:
enum Foo {
case one([Foo])
// or
// case one([String: Foo])
}
I would have expected the compiler to tell me to add indirect
to the enum cases, as Foo
contains more Foo
s. Array
s and Dictionary
s are all value types, so there are no indirection here. To me, this is structurally similar to this:
enum Foo {
case one(Bar<Foo>)
}
// arrays and dictionaries are just a struct with either something or nothing in it
// right?
struct Bar<T> {
let t: T?
}
... which does require me to add indirect
to the enum.
Then I tried tuples:
enum Foo {
case one((Foo, Foo))
}
and this is considered a recursive enum too. I thought arrays and tuples have the same memory layout, and that is why you can convert arrays to tuples using withUnsafeBytes
and then binding the memory...
What rules does this follow? Are there actually something special about arrays and dictionaries? If so, are there other such "special" types that looks like it should require indirect
, but actually doesn't?
The Swift Guide is not very helpful. It just says:
A recursive enumeration is an enumeration that has another instance of the enumeration as the associated value for one or more of the enumeration cases. You indicate that an enumeration case is recursive by writing indirect before it, which tells the compiler to insert the necessary layer of indirection.
What "has another instance of the enumeration" means is rather vague. It is even possible to interpret it to mean that case one(Bar<Foo>)
is not recursive - the case only has an instance of Bar<Foo>
, not Foo
.
I think part of the confusion stems from this assumption:
I thought arrays and tuples have the same memory layout, and that is why you can convert arrays to tuples using withUnsafeBytes and then binding the memory...
Arrays and tuples don't have the same memory layout:
Array<T>
is a fixed-size struct
with a pointer to a buffer which holds the array elements contiguously* in memory
NSArray
instances do not guarantee that their underlying storage is contiguous, but in the end this does not have an effect on the code below.The key thing is that the size of an Array<T>
does not change with the number of elements held (its size is simply the size of a pointer to the buffer), while a tuple does. The tuple is more equivalent to the buffer the array holds, and not the array itself.
Array<T>.withUnsafeBytes
calls Array<T>.withUnsafeBufferPointer
, which returns the pointer to the buffer, not to the array itself. *(In the case of a non-contiguous bridged NSArray
, _ArrayBuffer.withUnsafeBufferPointer
has to create a temporary contiguous copy of its contents in order to return a valid buffer pointer to you.)
When laying out memory for types, the compiler needs to know how large the type is. Given the above, an Array<Foo>
is statically known to be fixed in size: the size of one pointer (to a buffer elsewhere in memory).
Given
enum Foo {
case one((Foo, Foo))
}
in order to lay out the size of Foo
, you need to figure out the maximum size of all of its cases. It has only the single case, so it would be the size of that case.
Figuring out the size of one
requires figuring out the size of its associated value, and the size of a tuple of elements is the sum of the size of the elements themselves (taking into account padding and alignment, but we don't really care about that here).
Thus, the size of Foo
is the size of one
, which is the size of (Foo, Foo)
laid out in memory. So, what is the size of (Foo, Foo)
? Well, it's the size of Foo
+ the size of Foo
... each of which is the size of Foo
+ the size of Foo
... each of which is the size of Foo
+ the size of Foo
...
Where Array<Foo>
had a way out (Array<T>
is the same size regardless of T
), we're stuck in an infinite loop with no base case.
indirect
is the keyword required to break out of the recursion and give this infinite reference a base case. It inserts an implicit pointer by making a given case
the fixed size of a pointer, regardless of what it contains or points to. That makes the size of one
fixed, which allows Foo
to have a fixed size.
indirect
is less about Foo
referring to Foo
in any way, and more about allowing an enum
case to potentially contain itself indirectly (because direct containment would lead to an infinite loop).
As an aside, this is also why a struct
cannot contain a direct instance of itself:
struct Foo {
let foo: Foo // error: Value type 'Foo' cannot have a stored property that recursively contains it
}
would lead to infinite recursion, while
struct Foo {
let foo: UnsafePointer<Foo>
}
is fine.
struct
s don't support the indirect
keyword (at least in a struct
, you have more direct control over storage and layout), but there have been pitches for adding support for this on the Swift forums.
Indirect on enum cases can be used to tell the compiler to store the associated value of that enum as a pointer. Reference.
This is necessary when the compiler couldn't figure out the memory layout of the associated value without using a pointer, such as in the case of recursive enums.
The reason why you need to make the below case indirect
is because by using a generic struct that holds your enum as an associated value on a case of the enum itself, you end up with a recursive enum, since you can nest Foo
and Bar
in each other infinitely many times.
enum MaybeRecursive {
case array([MaybeRecursive])
case dict([String: MaybeRecursive])
indirect case genericStruct(GenericStruct<MaybeRecursive>)
}
struct GenericStruct<T> {
let t: T?
}
// if you replaced `.array` with `.genericStruct` again, you could keep going forever
let associatedValue = GenericStruct(t: MaybeRecursive.genericStruct(.init(t: .array([]))))
MaybeRecursive.genericStruct(associatedValue)
As for the Array
vs Tuple
difference, Itai Ferber has expertly explained that part in their own answer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With