I need a very big array length (size) in C#

public double[] result = new double[ ??? ];

I am storing results, and the total number of results is bigger than 2,147,483,647, which is the maximum value of Int32.

I tried BigInteger, ulong, etc., but all of them gave me errors.

How can I extend the size of the array so that it can store more than 50,147,483,647 results (doubles)?

Thanks...

asked by Atik, Dec 03 '22


2 Answers

An array of 2,147,483,648 doubles will occupy 16 GB of memory (2,147,483,648 elements × 8 bytes each). For some people, that's not a big deal. I've got servers that won't even bother to hit the page file if I allocate a few of those arrays. That doesn't mean it's a good idea.

When you are dealing with huge amounts of data like that you should be looking to minimize the memory impact of the process. There are several ways to go with this, depending on how you're working with the data.


Sparse Arrays

If your array is sparsely populated - lots of default/empty values with a small percentage of actually valid/useful data - then a sparse array can drastically reduce the memory requirements. You can write various implementations to optimize for different distribution profiles: random distribution, grouped values, arbitrary contiguous groups, etc.

Works fine for any type of contained data, including complex classes. Has some overheads, so can actually be worse than naked arrays when the fill percentage is high. And of course you're still going to be using memory to store your actual data.
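
As a minimal sketch of the idea, here's a sparse array of doubles backed by a dictionary; the class name and the choice of 0.0 as the default value are assumptions, not a canonical implementation:

using System;
using System.Collections.Generic;

// Hypothetical sketch: only non-default values consume memory;
// reading an index that was never set returns the default (0.0).
public class SparseDoubleArray
{
    private readonly Dictionary<long, double> _values = new Dictionary<long, double>();

    public long Length { get; }

    public SparseDoubleArray(long length) => Length = length;

    public double this[long index]
    {
        get
        {
            if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
            return _values.TryGetValue(index, out var v) ? v : 0.0;
        }
        set
        {
            if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
            if (value == 0.0) _values.Remove(index);   // don't store default values
            else _values[index] = value;
        }
    }
}

The dictionary's per-entry overhead is why this only pays off at low fill percentages, as noted above.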

Simple Flat File

Store the data on disk, create a read/write FileStream for the file, and enclose that in a wrapper that lets you access the file's contents as if it were an in-memory array. The simplest implementation of this will give you reasonable usefulness for sequential reads from the file. Random reads and writes can slow you down, but you can do some buffering in the background to help mitigate the speed issues.

This approach works for any type that has a fixed size, including structures that can be copied to/from a range of bytes in the file. It doesn't work for dynamically-sized data like strings.
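
A hedged sketch of such a wrapper, assuming raw 8-byte doubles stored back-to-back in the file; the class name is hypothetical, and the background buffering mentioned above is omitted for brevity:

using System;
using System.IO;

// Hypothetical sketch: exposes a file of raw doubles as a long-indexed "array".
// Element i lives at byte offset i * sizeof(double).
public class FileBackedDoubleArray : IDisposable
{
    private readonly FileStream _stream;

    public long Length { get; }

    public FileBackedDoubleArray(string path, long length)
    {
        Length = length;
        _stream = new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite);
        _stream.SetLength(length * sizeof(double));    // pre-size the file
    }

    public double this[long index]
    {
        get
        {
            var buffer = new byte[sizeof(double)];
            _stream.Seek(index * sizeof(double), SeekOrigin.Begin);
            _stream.Read(buffer, 0, buffer.Length);    // assumes a full read for brevity
            return BitConverter.ToDouble(buffer, 0);
        }
        set
        {
            _stream.Seek(index * sizeof(double), SeekOrigin.Begin);
            _stream.Write(BitConverter.GetBytes(value), 0, sizeof(double));
        }
    }

    public void Dispose() => _stream.Dispose();
}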

Complex Flat File

If you need to handle dynamic-size records, sparse data, etc. then you might be able to design a file format that can handle it elegantly. Then again, a database is probably a better option at this point.

Memory Mapped File

Same as the other file options, but using a different mechanism to access the data. See System.IO.MemoryMappedFiles.MemoryMappedFile for more information on how to use memory-mapped files from .NET.
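
For example, a minimal sketch using the BCL's memory-mapped file API; the file name and element count are assumptions, and a 64-bit process is required for a view this large:

using System.IO;
using System.IO.MemoryMappedFiles;

class Demo
{
    static void Main()
    {
        long count = 3_000_000_000;            // more elements than int.MaxValue
        long bytes = count * sizeof(double);

        // Backs the data with a file on disk; the OS pages it in and out on demand.
        using var mmf = MemoryMappedFile.CreateFromFile(
            "results.bin", FileMode.OpenOrCreate, null, bytes);
        using var accessor = mmf.CreateViewAccessor(0, bytes);

        long i = 2_500_000_000;                // an index no BCL array could reach
        accessor.Write(i * sizeof(double), 3.14);
        double x = accessor.ReadDouble(i * sizeof(double));
    }
}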

Database Storage

Depending on the nature of the data, storing it in a database might work for you. For a large array of doubles, however, it's unlikely to be a great option. There are overheads for reading/writing data in the database, plus storage overheads: each row will at least need a row identity, probably a BIGINT (8-byte integer) for a large recordset, doubling the size of the data right off the bat. Add in the overheads for indexing, row storage, etc. and you can very easily multiply the size of your data.

Databases are great for storing and manipulating complicated data. That's what they're for. If you have variable-width data - strings and the like - then a database is probably one of your best options. The flip-side is that they're generally not an optimal solution for working with large amounts of very simple data.


Whichever option you go with, you can create an IList<T>-compatible class that encapsulates your data. This lets you write code that doesn't have any need to know how the data is stored, only what it is.
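
One caveat worth noting: IList<T>'s indexer takes an int, so it cannot address elements beyond Int32.MaxValue. A long-indexed interface along these lines (the name is hypothetical) keeps the same storage-agnostic design without that limit:

// Hypothetical abstraction: any of the storage strategies above
// (sparse, flat file, memory-mapped, database) can implement this.
public interface IBigList<T>
{
    long Length { get; }
    T this[long index] { get; set; }
}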

answered by Corey, Dec 05 '22


BCL arrays cannot do that.
Someone wrote a chunked BigArray<T> class that can.

However, that will not magically create enough memory to store it.
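
For reference, a minimal sketch of the chunking technique such a class uses: elements are split across many inner arrays, each comfortably under the CLR's per-array limit, and a 64-bit index is mapped to a chunk plus an offset. This illustrates the idea, not the linked implementation:

// Hypothetical sketch of a chunked big array: a jagged array of fixed-size
// chunks, addressed with a 64-bit index.
public class BigArray<T>
{
    private const int ChunkSize = 1 << 20;   // 1,048,576 elements per chunk
    private readonly T[][] _chunks;

    public long Length { get; }

    public BigArray(long length)
    {
        Length = length;
        long chunkCount = (length + ChunkSize - 1) / ChunkSize;
        _chunks = new T[chunkCount][];
        // Chunks are allocated eagerly here; a real implementation
        // might allocate them lazily on first write.
        for (long i = 0; i < chunkCount; i++)
            _chunks[i] = new T[ChunkSize];
    }

    public T this[long index]
    {
        get => _chunks[index / ChunkSize][index % ChunkSize];
        set => _chunks[index / ChunkSize][index % ChunkSize] = value;
    }
}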

answered by SLaks, Dec 05 '22