I am using C#, .NET 4.0, 64-bit. I need to store in memory 500 million "data points" that are used in computations. I need to decide whether to create these as struct or class objects. Structs seem so much faster.
Is there a memory limit for the stack? If so, how can it be adjusted?
Will storing so much data on the stack affect the overall performance of the system?
(By the way, I am aware of the single-object size limitation in .NET, so that's being addressed -- the data will be stored in multiple collections).
It depends on your operating system. On Windows, the typical maximum size for a stack is 1MB, whereas it is 8MB on a typical modern Linux, although those values are adjustable in various ways.
The default stack size for 64-bit processes is 4MB; it is 1MB for 32-bit processes. You can modify the main thread's stack size by changing the value in the executable's PE header. You can also specify the stack size by using the right overload of the Thread constructor (sketched below).
Thus, the only real limitation on stack size is available memory on 64-bit systems (address space fragmentation is rather theoretical with a 16 exbibyte address space). That applies only to the stack of the first thread; new threads have to allocate their own stacks and are limited because they will run into other allocations.
Whenever the stack memory is completely filled, a stack overflow error occurs.
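As a minimal sketch of that Thread-constructor route (the method name and the 16MB figure here are purely illustrative):

using System;
using System.Threading;

class StackSizeDemo
{
    static void Main()
    {
        // Request a 16MB stack instead of the default; passing 0 would mean "use the default".
        var worker = new Thread(DoDeepWork, 16 * 1024 * 1024);
        worker.Start();
        worker.Join();
    }

    static void DoDeepWork()
    {
        // Deeply recursive or otherwise stack-hungry work would go here.
        Console.WriteLine("Running with an enlarged stack.");
    }
}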
You're asking the wrong question. If stack size matters, you're doing something wrong.
If you use many data points, you'll put them in a collection, such as an array. Arrays are always allocated on the heap. An array of structs embeds the individual structs and forms one contiguous memory block. (If you have more than 2GB, you need several arrays.)
With reference types, the array only contains the references, and each object is allocated individually on the heap. A heap allocation has about 16 bytes of overhead, and the reference in the array accounts for another 8.
You'll also get worse cache locality because of the indirection, and the GC has to do more work to crawl all those references.
My conclusion is that if you have many small data points, make them structs and put them in arrays.
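As a hedged illustration (the DataPoint fields below are an assumption, not the asker's actual layout), compare the two layouts:

using System;

// Value-type data point: an array of these is one contiguous block of memory,
// 16 bytes per element, with no per-element object header and no reference.
struct DataPoint
{
    public double Value;
    public long Timestamp;
}

// Reference-type version: the array holds 8-byte references, and every element
// is a separate heap object with roughly 16 bytes of header overhead on 64-bit.
class DataPointClass
{
    public double Value;
    public long Timestamp;
}

class LayoutDemo
{
    static void Main()
    {
        DataPoint[] structPoints = new DataPoint[1000000];          // ~16MB in a single block
        DataPointClass[] classPoints = new DataPointClass[1000000]; // ~8MB of references...
        for (int i = 0; i < classPoints.Length; i++)
            classPoints[i] = new DataPointClass();                  // ...plus ~32MB of individual objects
        Console.WriteLine(structPoints.Length + " / " + classPoints.Length);
    }
}

The struct version is what gives you the contiguous block and the better cache locality described above.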
You are going to store your data in arrays, and arrays are always stored on the heap, so it doesn't matter whether you use structs or classes to hold those arrays. You may well want to make sure that your data points are value types (i.e. structs) so that arrays of data points can be allocated efficiently in contiguous blocks of memory.
Performance differences between heap and stack allocated memory are most likely to be seen with small objects that are allocated and deallocated in a short space of time. For long-lived objects of the size you describe, I would expect there to be no difference in performance between stack and heap allocated memory.
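Since a single array of 500 million 16-byte points would blow past the 2GB per-object limit on .NET 4.0, one hedged sketch (the DataPoint struct, chunk size and class names are assumptions) is to spread the points over several struct arrays, each still a contiguous block:

using System;
using System.Collections.Generic;

struct DataPoint { public double Value; public long Timestamp; }

class ChunkedPoints
{
    const int ChunkSize = 50_000_000; // 50M * 16 bytes = ~800MB per chunk, safely below 2GB

    readonly List<DataPoint[]> chunks = new List<DataPoint[]>();

    public ChunkedPoints(long totalPoints)
    {
        for (long remaining = totalPoints; remaining > 0; remaining -= ChunkSize)
            chunks.Add(new DataPoint[(int)Math.Min(ChunkSize, remaining)]);
    }

    // Index across chunks; within each chunk the points stay contiguous.
    public DataPoint this[long index]
    {
        get { return chunks[(int)(index / ChunkSize)][(int)(index % ChunkSize)]; }
        set { chunks[(int)(index / ChunkSize)][(int)(index % ChunkSize)] = value; }
    }
}

class ChunkedPointsDemo
{
    static void Main()
    {
        // 500 million 16-byte points is roughly 8GB, so this needs a 64-bit process with enough RAM.
        var points = new ChunkedPoints(500_000_000L);
        points[499_999_999L] = new DataPoint { Value = 1.5, Timestamp = 42 };
        Console.WriteLine(points[499_999_999L].Value);
    }
}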
You could use classes for your data points. In this case, the memory will be allocated on the heap.
But considering that you are talking about 500 million data points, and especially since you are programming in the .NET world with a more restricted memory limit for apps, I would strongly encourage using some kind of embedded database, such as SQLite. That way you avoid holding all of your data points in memory simultaneously and only keep the ones you need for the current computation.
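As a rough sketch of that approach (assuming the System.Data.SQLite provider; the file name, table and columns are purely illustrative), you would stream only the rows needed for the current computation:

using System.Data.SQLite; // assumed ADO.NET provider for SQLite

class SqlitePointReader
{
    static void Main()
    {
        using (var connection = new SQLiteConnection("Data Source=points.db;Version=3;"))
        {
            connection.Open();
            using (var command = new SQLiteCommand(
                "SELECT value, timestamp FROM data_points WHERE timestamp BETWEEN @from AND @to",
                connection))
            {
                command.Parameters.AddWithValue("@from", 0L);
                command.Parameters.AddWithValue("@to", 1000000L);

                using (var reader = command.ExecuteReader())
                {
                    // Only the current row is materialized in memory at any one time.
                    while (reader.Read())
                    {
                        double value = reader.GetDouble(0);
                        long timestamp = reader.GetInt64(1);
                        // ... feed value/timestamp into the computation ...
                    }
                }
            }
        }
    }
}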
It's surprising that no one seemed to try to answer the actual question.
I absolutely understand that this is the wrong question to ask 99.9% of the time, but it would still be interesting to know the results (at least I was curious).
It is really simple using unsafe code and the stackalloc keyword:
using System;

class Program
{
    static void Main(string[] args)
    {
        // Keep allocating ever larger stack buffers until the stack overflows.
        for (int i = 100; i < Int32.MaxValue; i += 10)
        {
            StackCheck(i);
            Console.WriteLine($"Successfully allocated {i} bytes on the stack");
        }
    }

    // Allocates 'size' bytes directly on the current thread's stack
    // (requires compiling with /unsafe).
    public static unsafe void StackCheck(int size)
    {
        byte* array = stackalloc byte[size];
    }
}
Mind that this is 100% an implementation detail and may differ by CLR, CLR version, operating system or individual machine. In my experiment, both the full .NET Framework 4.7.2 and .NET Core 2.1.4 crashed just above the 1MB mark. Interestingly, it is not even consistent between runs; the results fluctuate by a few hundred bytes.
You cannot change the stack size on an existing thread, but you can set it on new ones:
Thread testThread = new Thread(() =>
{
    for (int i = 1000; i < Int32.MaxValue; i += 1000)
    {
        StackCheck(i);
        Console.WriteLine($"Successfully allocated {i} bytes on the stack");
    }
}, 200_000_000); // request a ~200 MB stack for this thread

testThread.Start();
testThread.Join();
Obviously the whole stack is reserved when you create the thread; if you set it too large, the Thread constructor will throw an OutOfMemoryException.
But again, this test was done mainly to satisfy my own curiosity; as others have stated, don't do this unless you really, really know what you're doing.