I read about collections in .NET nowadays. As known, there is some advantages using generic collections over non-generic: they are type-safety and there is no casting, no boxing/unboxing. That's why generic collections have a performance advantage. If we consider that non-generic collections store every member as <code>object</code>, then we can think that generics have also memory advantage. However, I didn't found any information about memory usage difference. Can anyone clarify about the point?

<blockquote> then we can think that generics have also memory advantage </blockquote> This assumption is false, it only applies on value-types. So considder this: <pre class="prettyprint"><code>new ArrayList { 1, 2, 3 }; </code></pre> This will implicetly cast every integer into <code>object</code> (known as boxing) in order to store it into your <code>ArrayList</code>. This will cause your memory-overhead here, because an <code>object</code> surely is bigger than a simple <code>int</code>. For reference-types there´s no difference however as there´s no need for boxing. Using the one or the other shouldn´t be driven bei neither any performance- nor memory-issues. However you should ask yourself what you want to do with the results. In particular if you know the type(s) stored in your collection at compile-time, there´s no reason to not put this information into the compile-process by using the right generic type-argument. Anyway you should allways use generic collections instead of non-generic ones because of the mentioned type-safety. EDIT: Your actual question if using a non-generic collection or a generic version is quite pointless: allways use the generic one. But not because of its memory-usage. See this: <pre class="prettyprint"><code>ArrayList a = new ArrayList { 1, 2, 3}; </code></pre> vs. <pre class="prettyprint"><code>List<object> a = new List<object> { 1, 2, 3 }; </code></pre> Both lists will consume same amount of memory, although the second one is generic. That´s because they both box your integers into <code>object</code>. So the answer to the question has nothing to do with memory. On te other saying for reference-types there´s no memory-differencee at all: <pre class="prettyprint"><code>ArrayList a = new ArrayList { myInstance, anotherInstance } </code></pre> vs. <pre class="prettyprint"><code>List<MyClass> a = new List<MyClass> { myInstance, anotherInstance } </code></pre> will produce the same memory-outcome. However the second one is far easier to maintain as you can work with the instances directly without casting them.

Memory usage difference between Generic and Non-generic collections in .NET

Tags:

c#

.net

collections

memory

generics

I read about collections in .NET nowadays. As known, there is some advantages using generic collections over non-generic: they are type-safety and there is no casting, no boxing/unboxing. That's why generic collections have a performance advantage.

If we consider that non-generic collections store every member as object, then we can think that generics have also memory advantage. However, I didn't found any information about memory usage difference.

Can anyone clarify about the point?

540

asked Nov 08 '17 14:11

Ilkin Sam Elimov

2 Answers

If we consider that non-generic collections store every member as object, then we can think that generics have also memory advantage. However, I didn't found any information about memory usage difference. Can anyone clarify about the point?

Sure. Let's consider an ArrayList that contains ints vs a List<int>. Let's suppose there are 1000 ints in each list.

In both, the collection type is a thin wrapper around an array -- hence the name ArrayList. In the case of ArrayList, there's an underlying object[] that contains at least 1000 boxed ints. In the case of List<int>, there's an underlying int[] that contains at least 1000 ints.

Why did I say "at least"? Because both use a double-when-full strategy. If you set the capacity of a collection when you create it then it allocates enough space for that many things. If you don't, then the collection has to guess, and if it guesses wrong and you need more capacity, then it doubles its capacity. So, best case, our collection arrays are exactly the right size. Worst case, they are possibly twice as big as they need to be; there could be room for 2000 objects or 2000 ints in the arrays.

But let's suppose for simplicity that we're lucky and there are about 1000 in each.

To start with, what's the memory burden of just the array? An object[1000] takes up 4000 bytes on a 32 bit system and 8000 bytes on a 64 bit system, just for the references, which are pointer sized. An int[1000] takes up 4000 bytes regardless. (There are also a few extra bytes taken up by array bookkeeping, but these costs are small compared to the marginal costs.)

So already we see that the non-generic solution possibly consumes twice as much memory just for the array. What about the contents of the array?

Well, the thing about value types is they are stored right there in their own variable. There is no additional space beyond those 4000 bytes used to store the 1000 integers; they get packed right into the array. So the additional cost is zero for the generic case.

For the object[] case, each member of the array is a reference, and that reference refers to an object; in this case, a boxed integer. What's the size of a boxed integer?

An unboxed value type doesn't need to store any information about its type, because its type is determined by the type of the storage its in, and that's known to the runtime. A boxed value type needs to somewhere store the type of the thing in the box, and that takes space. It turns out that the bookkeeping overhead for an object in 32 bit .NET is 8 bytes, and 16 on 64 bit systems. That's just the overhead; we of course need 4 bytes for the int. But wait, it gets worse: on 64 bit systems, the box must be aligned to an 8 byte boundary, so we need another 4 bytes of padding on 64 bit systems.

Add it all up: Our int[] takes about 4KB on both 64 and 32 bit systems. Our object[] containing 1000 ints takes about 16KB on 32 bit systems, and 32K on 64 bit systems. So the memory efficiency of an int[] vs an object[] is either 4 or 8 times worse for the non-generic case.

But wait, it gets even worse. That's just size. What about access time?

To access an integer from an array of integers, the runtime must:

verify that the array is valid
verify that the index is valid
fetch the value from the variable at the given index

To access an integer from an array of boxed integers, the runtime must:

verify that the array is valid
verify that the index is valid
fetch the reference from the variable at the given index
verify that the reference is not null
verify that the reference is a boxed integer
extract the integer from the box

That's a lot more steps, so it takes a lot longer.

BUT WAIT IT GETS WORSE.

Modern processors use caches on the chip itself to avoid going back to main memory. An array of 1000 plain integers is highly likely to end up in the cache so that accesses to the first, second, third, etc, members of the array in quick succession are all pulled from the same cache line; this is insanely fast. But boxed integers can be all over the heap, which increases cache misses, which greatly slows down access even further.

Hopefully that sufficiently clarifies your understanding of the boxing penalty.

What about non-boxed types? Is there a significant difference between an array list of strings, and a List<string>?

Here the penalty is much, much smaller, since an object[] and a string[] have similar performance characteristics and memory layouts. The only additional penalty in this case is (1) not catching your bugs until runtime, (2) making the code harder to read and edit, and (3) the slight penalty of a run-time type check.

105

answered Nov 14 '22 23:11

Eric Lippert

then we can think that generics have also memory advantage

This assumption is false, it only applies on value-types. So considder this:

new ArrayList { 1, 2, 3 };

This will implicetly cast every integer into object (known as boxing) in order to store it into your ArrayList. This will cause your memory-overhead here, because an object surely is bigger than a simple int.

For reference-types there´s no difference however as there´s no need for boxing.

Using the one or the other shouldn´t be driven bei neither any performance- nor memory-issues. However you should ask yourself what you want to do with the results. In particular if you know the type(s) stored in your collection at compile-time, there´s no reason to not put this information into the compile-process by using the right generic type-argument.

Anyway you should allways use generic collections instead of non-generic ones because of the mentioned type-safety.

EDIT: Your actual question if using a non-generic collection or a generic version is quite pointless: allways use the generic one. But not because of its memory-usage. See this:

ArrayList a = new ArrayList { 1, 2, 3};

vs.

List<object> a = new List<object> { 1, 2, 3 };

Both lists will consume same amount of memory, although the second one is generic. That´s because they both box your integers into object. So the answer to the question has nothing to do with memory.

On te other saying for reference-types there´s no memory-differencee at all:

ArrayList a = new ArrayList { myInstance, anotherInstance }

vs.

List<MyClass> a = new List<MyClass> { myInstance, anotherInstance }

will produce the same memory-outcome. However the second one is far easier to maintain as you can work with the instances directly without casting them.

answered Nov 14 '22 23:11

MakePeaceGreatAgain

Related questions
                            
                                Hangfire RecurringJob with dependency injection
                            
                                C# - Getting file names starting with a specific format in a directory
                            
                                Why I getting an "Expression always true" warning?
                            
                                Calling method of non-assigned class
                            
                                Android Create Calendar Event always as Birthday
                            
                                Unit test to verify that a base class method is called
                            
                                How to encrypt / decrypt a configuration file section with RsaProtectedConfigurationProvider
                            
                                Convert.ToDateTime('Datestring') to required dd-MM-yyyy format of date
                            
                                NullReferenceException vs ArgumentNullException
                            
                                Can I guarantee that a non-ref method doesn't change the array size C#
                            
                                How to inject DbContext instance in the ConfigureServices method of startup.cs correctly (ASP.net core 1.1)?
                            
                                Async socket client using TcpClient class C#
                            
                                Compiler warning when using const boolean
                            
                                MVC5 Where to put authentication forms in web.config?
                            
                                Parsing Date Time in C#, Getting wrong output
                            
                                getting a list of installed browsers on the computer
                            
                                remove duplicate items from list in c#
                            
                                C# new operate bug?
                            
                                Fluent Validation changing CustomAsync to MustAsync
                            
                                Why is float.min different between C++ and C#?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With