I have an Excel sheet with a million rows and 100 columns. Each row represents an instance of a class with 100 attributes, and the column values are the values of those attributes.
Which data structure is best suited to storing the million instances?
Thanks
It really depends on how you need to access this data and what you want to optimize for, such as space versus speed.
One million rows of 100 values, where each value uses 8 bytes of memory, is only 800 MB, which will easily fit into the memory of most PCs, especially if they are 64-bit. Try to make the type of each column as compact as possible.
A more efficient way of storing the data is by column, i.e. you have one array per column, each with a primitive data type. I suspect you don't even need to do this.
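To make the column layout concrete, here is a minimal sketch in Java. The class name `ColumnarTable` and the three attributes (`ids`, `prices`, `categories`) are made up for illustration; the real sheet would have 100 columns, each with whatever primitive type fits its values.

```java
/** Hypothetical column-oriented table: one primitive array per attribute. */
class ColumnarTable {
    // Illustrative attributes only; the real sheet would have 100 of these.
    final int[] ids;          // 4 bytes per row
    final double[] prices;    // 8 bytes per row
    final byte[] categories;  // 1 byte per row, enough for up to 256 distinct codes

    ColumnarTable(int rowCount) {
        ids = new int[rowCount];
        prices = new double[rowCount];
        categories = new byte[rowCount];
    }

    /** Writes one logical row by storing each value in its own column array. */
    void setRow(int row, int id, double price, byte category) {
        ids[row] = id;
        prices[row] = price;
        categories[row] = category;
    }

    /** Scans a single column; the other columns are never touched. */
    double totalPrice() {
        double sum = 0.0;
        for (double p : prices) {
            sum += p;
        }
        return sum;
    }

    /** Payload bytes used per row, ignoring the fixed per-array overhead. */
    static int bytesPerRow() {
        return Integer.BYTES + Double.BYTES + Byte.BYTES;
    }
}
```

Compared with a million objects of 100 fields each, this avoids a per-object header and lets each column use the smallest type that fits, which is where the space saving comes from.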
If you have many more rows, e.g. billions, you can use off-heap memory, i.e. memory-mapped files and direct memory. This can efficiently store more data than you have main memory while keeping your heap relatively small (e.g. 100s of GB off-heap with 1 GB in heap).
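As a rough sketch of the memory-mapped approach, the class below stores a table of `double` values in a file and reads and writes cells through a `MappedByteBuffer`. The class name `MappedDoubleTable` and the row-major layout are assumptions for illustration. Note that a single mapping in Java is limited to 2 GB, so a table of 100s of GB would need to be split across several mappings, which this sketch does not do.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;

/** Hypothetical off-heap table of doubles backed by a memory-mapped file. */
class MappedDoubleTable implements AutoCloseable {
    private final RandomAccessFile file;
    private final MappedByteBuffer buffer;
    private final int columnCount;

    MappedDoubleTable(Path path, int rowCount, int columnCount) {
        long size = (long) rowCount * columnCount * Double.BYTES;
        if (size > Integer.MAX_VALUE) {
            // One MappedByteBuffer cannot exceed 2 GB; larger tables need several.
            throw new IllegalArgumentException("table too large for a single mapping");
        }
        this.columnCount = columnCount;
        try {
            file = new RandomAccessFile(path.toFile(), "rw");
            file.setLength(size); // make the file large enough to hold every cell
            buffer = file.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Byte offset of a cell, assuming rows are stored one after another. */
    private int offset(int row, int col) {
        return (row * columnCount + col) * Double.BYTES;
    }

    void set(int row, int col, double value) {
        buffer.putDouble(offset(row, col), value);
    }

    double get(int row, int col) {
        return buffer.getDouble(offset(row, col));
    }

    /** Flushes pending changes to disk and closes the underlying file. */
    @Override
    public void close() {
        try {
            buffer.force();
            file.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The mapped data lives outside the Java heap and is paged in by the operating system on demand, so the heap only holds the small wrapper object regardless of how large the file is.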