I'm looking for a way to store a massive quantity of information using as little disk space as possible.
The data structure is very simple and the queries will also be very simple. I've looked at solutions like Apache Cassandra and relational databases, but couldn't find a comparison that mentions disk usage.
Any ideas on this would be great.
The total disk space used by the database, in bytes, is the number of data pages multiplied by the page size. In the example, 161646 pages * 8192 bytes per page = 1324204032 bytes, or roughly 1.23 GB (counting 1 GB as 1024^3 bytes).
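The arithmetic above can be sketched in a few lines of Python. The function name `database_size_bytes` is hypothetical, and the 8192-byte default is simply the page size used in the example, not a value that applies to every database.

```python
def database_size_bytes(page_count: int, page_size: int = 8192) -> int:
    """Return total bytes used: number of data pages times page size."""
    return page_count * page_size


# Figures taken from the example above.
size = database_size_bytes(161646)
print(size)                         # 1324204032
print(round(size / 1024 ** 3, 2))   # 1.23 (GB, with 1 GB = 1024^3 bytes)
```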
A database snapshot does not reserve space when it is created, so it is quite possible for it to run out of space (and hence become unusable). Database snapshots use NTFS sparse files: you can have an arbitrarily large file that takes up very little space on disk.
Speaking of Apache Cassandra: it's just a disk space hog. 200 MB of logs resulted in 1.2 GB of files produced by Cassandra, and the keyspace was just 4 columns holding strings of length 200.
Take a look at Oracle Berkeley DB, a very simple, robust key/value database:
"Berkeley DB enables the development of custom data management solutions, without the overhead traditionally associated with such custom projects. Berkeley DB provides a collection of well-proven building-block technologies that can be configured to address any application need from the handheld device to the datacenter, from a local storage solution to a world-wide distributed one, from kilobytes to petabytes."
Redis might be worth a look if you can store your data as key-value pairs.
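Since published comparisons rarely mention disk usage, one option is to load a sample of your own data into a candidate store and measure the files it produces. Below is a minimal sketch of that idea, using Python's standard-library `dbm` module purely as a stand-in key/value store (it is not Berkeley DB, Redis, or Cassandra, and its on-disk format depends on which backend your Python build provides). The function name `measure_store_size` and the sample records are hypothetical.

```python
import dbm
import os
import tempfile


def measure_store_size(records: dict[str, str]) -> tuple[int, int]:
    """Return (raw_bytes, on_disk_bytes) for records written to a dbm store."""
    # Raw payload: the encoded length of every key and value.
    raw = sum(len(k.encode()) + len(v.encode()) for k, v in records.items())
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample")
        with dbm.open(path, "c") as db:
            for key, value in records.items():
                db[key.encode()] = value.encode()
        # Some backends create several files, so sum everything in the directory.
        on_disk = sum(
            os.path.getsize(os.path.join(tmp, name)) for name in os.listdir(tmp)
        )
    return raw, on_disk


# Hypothetical sample: 1000 rows with 200-character values.
sample = {f"row-{i:04d}": "x" * 200 for i in range(1000)}
raw, on_disk = measure_store_size(sample)
print(f"raw payload: {raw} bytes, on disk: {on_disk} bytes")
```

The ratio between the two numbers gives a rough storage overhead for that store and that data shape. Repeating the same measurement with each candidate database, on a realistic sample, is more reliable than any single anecdote.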