How come my class takes so much space in memory?

Tags: memory-management, c#

I will have literally tens of millions of instances of some class MyClass and want to minimize its memory size. How to measure the space an object takes in memory was discussed in "Find out the size of a .net object". I decided to follow Jon Skeet's suggestion, and this is my code:

   using System;

   // Edit: This line is "dangerous and foolish" :-)
   // (However, commenting it out does not change the result)
   // [StructLayout(LayoutKind.Sequential, Pack = 1)]
   public class MyClass
   {
      public bool isit;
      public MyClass nextRight;
      public MyClass nextDown;
   }

   class Program
   {
      static void Main(string[] args)
      {
         var a1 = new MyClass(); // to prevent JIT code mangling the result (Skeet)
         var before = GC.GetTotalMemory(true);
         MyClass[] arr = new MyClass[10000];
         for (int i = 0; i < 10000; i++)
            arr[i] = new MyClass();

         var after = GC.GetTotalMemory(true);

         // Note: the difference also includes the array itself (one reference per element).
         var per = (after - before) / 10000.0;
         Console.WriteLine("Before: {0} After: {1} Per: {2}", before, after, per);
         GC.KeepAlive(arr); // keep the instances reachable until after the measurement
         Console.ReadLine();
      }
   }

I run the program on 64-bit Windows, in the "Release" configuration, with platform target "Any CPU" and "Optimize code" checked. (The options only matter if I explicitly target x86.) The result is, sadly, 48 bytes per instance.

My calculation would be 8 bytes per reference, plus 1 byte for the bool, plus some ~8 bytes of overhead. What is going on? Is this a conspiracy to keep RAM prices high and/or let non-Microsoft code bloat? Well, ok, I guess my real question is: what am I doing wrong, or how can I minimize the size of MyClass?

Edit: I apologize for being sloppy in my question; I edited a couple of identifier names. My concrete and immediate concern was to build a "2-dim linked list" as a sparse boolean matrix implementation, where I can easily enumerate the set values in a given row or column. [Of course that means I also have to store the x,y coordinates on the class, which makes my idea even less feasible.]

asked Jan 17 '12 16:01

Ali Ferhat

1 Answer

Approach the problem from the other end. Rather than asking yourself "how can I make this data structure smaller and still have tens of millions of them allocated?" ask yourself "how can I represent this data using a completely different data structure that is far more compact?"

It looks like you are building a doubly-linked list of bools, which, as you note, uses thirty to fifty times more memory than it needs to. Is there some reason why you're not simply using a BitArray to store your list of bools?
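As a rough illustration of the BitArray suggestion, here is a minimal sketch using System.Collections.BitArray. The class and method names (BitArrayDemo, MakeFlags, CountSet) are made up for this example. A BitArray stores one bit per flag, so ten million flags occupy roughly 1.25 MB.

```csharp
using System.Collections;

// Hypothetical helper class; the names are assumptions for illustration only.
public static class BitArrayDemo
{
    // Creates `count` flags (count must be at least 1), all false,
    // then sets the first and the last one.
    public static BitArray MakeFlags(int count)
    {
        var bits = new BitArray(count);   // every bit starts out false
        bits[0] = true;
        bits[count - 1] = true;
        return bits;
    }

    // Counts how many flags are set by walking the whole array.
    public static int CountSet(BitArray bits)
    {
        int n = 0;
        foreach (bool b in bits)
            if (b) n++;
        return n;
    }
}
```

For example, BitArrayDemo.MakeFlags(10000000) allocates a single object holding all ten million flags, instead of ten million separate heap objects.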

UPDATE:

> in fact I was trying to implement a sparse boolean two-dimensional matrix

Well why didn't you say so in the first place?

When I want to make a sparse Boolean two-d matrix of enormous size, I build an immutable persistent boolean quadtree with a memoized factory. If the array is sparse, or even if it is dense but self-similar in some way, you can achieve enormous compressions. Square arrays of 2⁶⁴ x 2⁶⁴ Booleans are easily representable even though obviously as a real array, that would be more memory than exists in the world.

I have been toying with the idea of doing a series of blog articles on this technique; I will likely do so in late March. (UPDATE: I did not write that article in March 2012; I wrote it in August 2020. https://ericlippert.com/2020/08/17/life-part-32/)

Briefly, the idea is to make an abstract class Quad that has two subclasses: Single, and Multi. "Single" is a doubleton -- like a singleton, but with exactly two instances, called True and False. A Multi is a Quad that has four sub-quads, called NorthEast, SouthEast, SouthWest and NorthWest.

Each Quad has an integer "level"; the level of a Single is zero, and a multi of level n is required to have all of its children be Quads of level n-1.

The Multi factory is memoized; when you ask it to make a new Multi with four children, it consults a cache to see if it has made it before. If it has, it does not construct a new one; it hands out the old one. Since Quads are immutable, you do not have to worry about someone changing the Quad on you after it is in the cache.

Consider now how many memory words (a word is 4 or 8 bytes depending on architecture) an "all false" Multi of level n consumes. A level 1 "all false" multi consumes four words for the links to its children, a word for the level count (if necessary; you are not required to keep the level in the multi, though it helps for debugging) and a couple words for the sync block and so on. Let's call it eight words. (Plus the memory for the False Single quad, which we can assume is a constant two or three words, and thereby may be ignored.)

A level 2 "all false" multi consumes the same eight words, but each of its four children is the same level 1 multi. Therefore the total consumption of the level 2 "all false" multi is let's say 16 words.

The same goes for levels 3, 4, and so on: each level adds just one more eight-word Multi. The total memory consumption for a level 64 multi that is logically a 2⁶⁴ x 2⁶⁴ square array of Booleans is therefore only about 64 x 8 memory words!
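The Quad, Single and Multi types and the memoized factory described above might be sketched as follows. This is only an illustration under stated assumptions, not the code from the blog series: the member names Make, AllFalse and CachedCount are invented here, the cache is a plain dictionary keyed on the four child references, and nothing is thread-safe.

```csharp
using System.Collections.Generic;

public abstract class Quad
{
    public abstract int Level { get; }
}

// The "doubleton": exactly two instances, True and False, both of level zero.
public sealed class Single : Quad
{
    public static readonly Single True = new Single(true);
    public static readonly Single False = new Single(false);

    public bool Value { get; }
    private Single(bool value) { Value = value; }
    public override int Level => 0;
}

public sealed class Multi : Quad
{
    // Memoization cache (an assumption of this sketch). Children are compared
    // by reference, which is enough because every child is itself canonical.
    private static readonly Dictionary<(Quad, Quad, Quad, Quad), Multi> Cache =
        new Dictionary<(Quad, Quad, Quad, Quad), Multi>();

    public Quad NorthEast { get; }
    public Quad SouthEast { get; }
    public Quad SouthWest { get; }
    public Quad NorthWest { get; }
    public override int Level { get; }

    private Multi(Quad ne, Quad se, Quad sw, Quad nw)
    {
        NorthEast = ne; SouthEast = se; SouthWest = sw; NorthWest = nw;
        Level = ne.Level + 1;
    }

    // Number of distinct Multi objects constructed so far.
    public static int CachedCount => Cache.Count;

    // Memoized factory: hands out the existing Multi if these four
    // children have been combined before, otherwise builds and caches one.
    public static Multi Make(Quad ne, Quad se, Quad sw, Quad nw)
    {
        if (se.Level != ne.Level || sw.Level != ne.Level || nw.Level != ne.Level)
            throw new System.ArgumentException("All four children must have the same level.");

        var key = (ne, se, sw, nw);
        if (!Cache.TryGetValue(key, out Multi existing))
        {
            existing = new Multi(ne, se, sw, nw);
            Cache.Add(key, existing);
        }
        return existing;
    }

    // Builds the "all false" quad of the given level, reusing one node per level.
    public static Quad AllFalse(int level)
    {
        Quad q = Single.False;
        for (int i = 0; i < level; i++)
            q = Make(q, q, q, q);
        return q;
    }
}
```

In this sketch, Multi.AllFalse(64) constructs only 64 Multi objects, one per level, and asking for it a second time returns the very same instance.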

Make sense? Hopefully that is enough of a sketch to get you going. If not, see my blog link above.

answered Oct 17 '22 18:10

Eric Lippert
