We all know mutable structs are evil in general. I'm also pretty sure that because IEnumerable<T>.GetEnumerator()
returns type IEnumerator<T>
, the structs are immediately boxed into a reference type, costing more than if they were simply reference types to begin with.
So why, in the BCL generic collections, are all the enumerators mutable structs? Surely there had to have been a good reason. The only thing that occurs to me is that structs can be copied easily, thus preserving the enumerator state at an arbitrary point. But adding a Copy()
method to the IEnumerator
interface would have been less troublesome, so I don't see this as being a logical justification on its own.
Even if I don't agree with a design decision, I would like to be able to understand the reasoning behind it.
Class instances each have an identity and are passed by reference, while structs are handled and mutated as values. Basically, if we want all of the changes that are made to a given object to be applied the same instance, then we should use a class — otherwise a struct will most likely be a more appropriate choice.
In classes, two variables can contain the reference of the same object and any operation on one variable can affect another variable. In this way, struct should be used only when you are sure that, It logically represents a single value, like primitive types (int, double, etc.). It is immutable.
Structures (also called structs) are a way to group several related variables into one place. Each variable in the structure is known as a member of the structure. Unlike an array, a structure can contain many different data types (int, float, char, etc.).
The struct (structure) is like a class in C# that is used to store data. However, unlike classes, a struct is a value type. Suppose we want to store the name and age of a person. We can create two variables: name and age and store value.
Indeed, it is for performance reasons. The BCL team did a lot of research on this point before deciding to go with what you rightly call out as a suspicious and dangerous practice: the use of a mutable value type.
You ask why this doesn't cause boxing. It's because the C# compiler does not generate code to box stuff to IEnumerable or IEnumerator in a foreach loop if it can avoid it!
When we see
foreach(X x in c)
the first thing we do is check to see if c has a method called GetEnumerator. If it does, then we check to see whether the type it returns has method MoveNext and property current. If it does, then the foreach loop is generated entirely using direct calls to those methods and properties. Only if "the pattern" cannot be matched do we fall back to looking for the interfaces.
This has two desirable effects.
First, if the collection is, say, a collection of ints, but was written before generic types were invented, then it does not take the boxing penalty of boxing the value of Current to object and then unboxing it to int. If Current is a property that returns an int, we just use it.
Second, if the enumerator is a value type then it does not box the enumerator to IEnumerator.
Like I said, the BCL team did a lot of research on this and discovered that the vast majority of the time, the penalty of allocating and deallocating the enumerator was large enough that it was worth making it a value type, even though doing so can cause some crazy bugs.
For example, consider this:
struct MyHandle : IDisposable { ... } ... using (MyHandle h = whatever) { h = somethingElse; }
You would quite rightly expect the attempt to mutate h to fail, and indeed it does. The compiler detects that you are trying to change the value of something that has a pending disposal, and that doing so might cause the object that needs to be disposed to actually not be disposed.
Now suppose you had:
struct MyHandle : IDisposable { ... } ... using (MyHandle h = whatever) { h.Mutate(); }
What happens here? You might reasonably expect that the compiler would do what it does if h were a readonly field: make a copy, and mutate the copy in order to ensure that the method does not throw away stuff in the value that needs to be disposed.
However, that conflicts with our intuition about what ought to happen here:
using (Enumerator enumtor = whatever) { ... enumtor.MoveNext(); ... }
We expect that doing a MoveNext inside a using block will move the enumerator to the next one regardless of whether it is a struct or a ref type.
Unfortunately, the C# compiler today has a bug. If you are in this situation we choose which strategy to follow inconsistently. The behaviour today is:
if the value-typed variable being mutated via a method is a normal local then it is mutated normally
but if it is a hoisted local (because it's a closed-over variable of an anonymous function or in an iterator block) then the local is actually generated as a read-only field, and the gear that ensures that mutations happen on a copy takes over.
Unfortunately the spec provides little guidance on this matter. Clearly something is broken because we're doing it inconsistently, but what the right thing to do is not at all clear.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With