
How to calculate the future database size in Mongo?

Tags:

mongodb

I'm using MongoDB and we are really happy with it, but recently our client asked us to estimate how large the database will be in the future.

We know how to calculate this for a typical relational database, but we don't have much production experience with this NoSQL database.

Things that we know:

  • db.namecollections.stats() gives us important information such as size (documents), avgObjSize (documents), storageSize, and totalIndexSize (more here)

With the size and totalIndexSize we can calculate the total size for the collection only, but the big question here is:

  • Why is there a difference between the collection size and storageSize?

How can we account for this when estimating the future database size?

KCOtzen asked Dec 23 '11 15:12



1 Answer

MongoDB pads documents slightly so that they can grow without having to be moved to the end of the collection on disk (an expensive operation).

MongoDB also pre-allocates data files: to boost speed, it creates the next file and fills it with zeros before it is needed.

You can pass the --noprealloc flag to mongod to prevent that from happening.

If you want more info you can look here
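
As a rough illustration of the gap described above, the fields reported by db.namecollections.stats() can be compared directly. This is only a sketch in plain JavaScript: storageOverhead is a hypothetical helper, and the numbers in the sample object are made up rather than real output.

```javascript
// Hypothetical helper: summarizes a stats document such as the one
// returned by db.namecollections.stats() in the mongo shell.
function storageOverhead(stats) {
  // Bytes allocated beyond the raw document data (padding, preallocation).
  const overheadBytes = stats.storageSize - stats.size;
  return {
    overheadBytes,
    // Ratio of allocated space to document data.
    overheadRatio: stats.storageSize / stats.size,
    // Allocated data space plus all indexes for the collection.
    totalBytes: stats.storageSize + stats.totalIndexSize,
  };
}

// Made-up sample values, in bytes (an assumption, not real stats output).
const sample = { size: 1000000, storageSize: 1500000, totalIndexSize: 250000 };
const report = storageOverhead(sample);
console.log(report);
```

Measuring this ratio on your own collections gives a multiplier you can apply to any estimate of raw document data.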

With regard to your question about calculating disk space 5 years out: if you can figure out an equation for the growth of your data and make some assumptions about your average document size and about how many and what kinds of indexes you will have, you might be able to come up with an estimate.

Having also worked for a bank, my suggestion would be to come up with an insane upper bound and then quadruple it. Money is cheap inside a bank; calculation mistakes are not.
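
The estimate described above could be sketched roughly as follows. Everything here is an assumption for illustration: projectSizeBytes is a hypothetical function, growth is modeled as linear, the input numbers are invented, and indexBytesPerDoc would have to be derived from your own data (for example totalIndexSize divided by count from stats()).

```javascript
// Hypothetical estimator: every input is an assumption you must supply.
function projectSizeBytes(p) {
  // Linear growth model (an assumption): document count after p.months.
  const docs = p.docsNow + p.docsPerMonth * p.months;
  // Document data, scaled by the observed storageSize / size ratio.
  const dataBytes = docs * p.avgObjSize * p.overheadRatio;
  // Index space, assuming it grows in proportion to the document count.
  const indexBytes = docs * p.indexBytesPerDoc;
  // Multiply the total by a safety factor (4 = "quadruple it").
  return (dataBytes + indexBytes) * p.safetyFactor;
}

// Invented example inputs; replace them with your own measurements.
const estimate = projectSizeBytes({
  docsNow: 1000000,
  docsPerMonth: 50000,
  months: 60, // 5 years
  avgObjSize: 512, // bytes, from stats().avgObjSize
  overheadRatio: 1.5, // storageSize / size
  indexBytesPerDoc: 64, // totalIndexSize / count
  safetyFactor: 4,
});
console.log((estimate / 1024 ** 3).toFixed(1) + " GiB");
```

If your data grows faster than linearly, swap the first line of the function for whatever growth equation fits your history; the rest of the calculation stays the same.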

Tyler Brock answered Oct 20 '22 05:10