What kind of an architecture is needed to store 100 TB data and query it with aggregation? How many nodes? Disk size per node? What can the best practice be?
Every day 240GB will be written but the size will remain same because the same amount data will be deleted.
Or any different thoughts about storing the data and fast group queries?
The maximum size an individual document can be in MongoDB is 16MB with a nested depth of 100 levels. Edit: There is no max size for an individual MongoDB database.
MongoDB stores huge amounts of data in a naturally traversable format, making it a good choice to store, query, and analyze big data.
Working with MongoDB and ElasticSearch is an accurate decision to process millions of records in real-time. These structures and concepts could be applied to larger datasets and will work extremely well too.
Kindly refer to related question,
MongoDB limit storage size?
Quoting from the the top answer:
The "production deployments" page on MongoDB's site may be of interest to you. Lots of presentations listed with infrastructure information. For example:
http://blog.wordnik.com/12-months-with-mongodb says they're storing 3 TB per node.
I highly recommend HBase.
Facebook uses it for its Messages service, which in Nov 2010 was handling 15 billion messages a day.
We tested MongoDB for a large data set but ended up going with HBase and have been happily using it for months now.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With