Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to deal with the immutability of returned structs?

I'm writing a game that has a huge 2D array of "cells". A cell takes only 3 bytes. I also have a class called CellMap, which contains the 2D array as a private field, and provides access to it via a public indexer.

Profiling showed that a performance problem is caused by garbage collection of too many Cell objects. So I decided to make Cell a struct (it was a class).

But now code like this doesn't work:

cellMap[x, y].Population++;

I can think of many options, but I don't really like any of them.

  1. Make the array public, and write cellMap.Data[x, y].Population = 5;
  2. Stop using a CellMap class, and just use a 2D array directly. But CellMap is very convenient because it implements its own optimized serialization, and it exposes Width and Height properties that are more convenient than writing cellMap.GetLength(0)
  3. Make Cell immutable. But then how would the code look? cellMap[x, y] = IncrementCellPopulation(cellMap[x, y])? Very verbose.
  4. A couple of utility functions like cellMap.SetPopulationAt(x, y, 5)
  5. In every class that owns a CellMap, add a utility property like private Cell[,] CellData { get { return this.CellMap.GetInternalArray(); } }, so then my code can look like CellData[x, y].Population++

How is this problem traditionally solved?

like image 463
Stefan Monov Avatar asked Sep 01 '10 13:09

Stefan Monov


People also ask

Should structs be immutable?

A struct type is not immutable. Yes, strings are. Making your own type immutable is easy, simply don't provide a default constructor, make all fields private and define no methods or properties that change a field value. Have a method that should mutate the object return a new object instead.

Why are mutable structs evil?

Claiming mutable structs are evil is like claiming mutable int s, bool s, and all other value types are evil. There are cases for mutability and for immutability. Those cases hinge on the role the data plays, not the type of memory allocation/sharing.

Why are structs immutable C#?

Struct is also a ValueType which inherits same. It's a good idea to make struct immutable, it means once it is initalized, it cannot be modified. System. DateTime is an immutable Value Type.


1 Answers

So there are actually two problems here. There's the question you actually asked: what are techniques to deal with the fact that structs ought to be immutable because they are copied by value, but you want to mutate one. And then there's the question which is motivating this one, which is "how can I make the performance of my program acceptable?"

My other answer addresses the first question, but the second question is interesting as well.

First off, if the profiler has actually identified that the performance problem is due to garbage collection of cells, then it is possible that making cell into a struct will help. It is also possible that it will not help at all, and it is possible that doing so will make it worse.

Your cells do not contain any reference types; we know this because you've said they are only three bytes. If someone else reading this is thinking that they could make a performance optimization by turning a class into a struct then it might not help at all because the class might contain a field of reference type, in which case the garbage collector still has to collect every instance, even if it is turned into a value type. The reference types in it need to be collected too! I would only recommend attempting this for performance reasons if Cell contains only value types, which apparently it does.

It might make it worse because value types are not a panacea; they have costs too. Value types are often more expensive to copy than reference types (which are pretty much always the size of a register, almost always aligned on the appropriate memory boundary, and therefore the chip is highly optimized for copying them). And value types are copied all the time.

Now, in your case you have a struct which is smaller than a reference; references are four or eight bytes typically. And you're putting them in an array, which means that you are packing the array down; if you have a thousand of them, it'll take three thousand bytes. Which means that three out of every four structs in there are misaligned, meaning more time (on many chip architectures) to get the value out of the array. You might consider measuring the impact of padding your struct out to four bytes to see if that makes a difference, provided you're still going to keep them in an array, which brings me to my next point...

The Cell abstraction might simply be a bad abstraction for the purpose of storing data about lots of cells. If the problem is that Cells are classes, you're keeping an array of thousands of Cells, and collecting them is expensive, then there are solutions other than making Cell into a struct. Suppose for example that a Cell contains two bytes of Population and one byte of Color. That is the mechanism of Cell, but surely that is not the interface you want to expose to the users. There is no reason why your mechanism has to use the same type as the interface. And therefore you could manufacture instances of the Cell class on demand:

interface ICell
{
   public int Population { get; set; }
   public Color Color { get; set; }
}
private class CellMap
{
    private ushort[,] populationData; // Profile the memory burden vs speed cost of ushort vs int
    private byte[,] colorData; // Same here. 
    public ICell this[int x, int y] 
    {
        get { return new Cell(this, x, y); }
    }

    private sealed class Cell : ICell
    {
        private CellMap map;
        private int x;
        private int y;
        public Cell(CellMap map, int x, int y)
        {
            this.map = map; // etc
        }
        public int Population  
        {
            get { return this.map.populationData[this.x, this.y]; } 
            set { this.map.populationData[this.x, this.y] = (ushort) value; } 
        }

and so on. Manufacture the cells on demand. They will almost immediately be collected if they are short-lived. CellMap is an abstraction, so use the abstraction to hide the messy implementation details.

With this architecture you don't have any garbage collection problems because you have almost no live Cell instances, but you can still say

map[x,y].Population++;

no problem, because the first indexer manufactures an immutable object which knows how to update the state of the map. The Cell doesn't need to be mutable; notice that the Cell class is completely immutable. (Heck, the Cell could be a struct here, though of course casting it to ICell would just box it anyway.) It is the map which is mutable, and the cell mutates the map for the user.

like image 154
Eric Lippert Avatar answered Sep 21 '22 05:09

Eric Lippert