
Are there any tools to estimate index size in MongoDB?

I'm looking for a tool to get a decent estimate of how large a MongoDB index will be based on a few signals like:

  • How many documents in my collection
  • The size of the indexed field(s)
  • The size of the _id I'm using if not ObjectId
  • Geo/Non-geo

Has anyone stumbled across something like this? I can imagine it would be extremely useful given Mongo's performance degradation once it hits the memory wall and documents start getting paged out to disk. If I have a functioning database and want to add another index, the only way I'll know if it will be too big is to actually add it.

It wouldn't need to be accurate down to the bit, but with some assumptions about B-Trees and the index implementation I'm sure it could be reasonable enough to be helpful.

If this doesn't exist already, I'd like to build it and open-source it, so if I've missed any parameters required for this calculation, please include them in your answer.

jpredham asked Dec 22 '11 17:12




2 Answers

I just spoke with some of the 10gen engineers. There isn't a tool, but you can do a back-of-the-envelope calculation based on this formula:

2 * [ n * ( 18 bytes overhead + avg size of indexed field + 5 or so bytes of conversion fudge factor ) ]

Where n is the number of documents you have.

The overhead and conversion padding are MongoDB-specific, but the 2x comes from the B-tree data structure being roughly half full in the worst case (while still having allocated 100% of the space a full tree would require).

I'd explain more but I'm learning about it myself at the moment. This presentation will have more details: http://www.10gen.com/presentations/mongosp-2011/mongodb-internals
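For anyone who wants to plug numbers in, here is a minimal sketch of the formula above as plain JavaScript (it runs in Node or can be pasted into the mongo shell). The function and constant names are made up for illustration and are not part of any MongoDB API. The 18-byte overhead and 5-byte fudge factor are simply the figures quoted above, so treat the result as a rough worst-case guess.

```javascript
// Rough worst-case index size estimate, following the formula in this answer:
//   2 * [ n * (18 bytes overhead + avg indexed field size + ~5 bytes fudge) ]
// All names here are hypothetical helpers, not MongoDB APIs.
const OVERHEAD_BYTES = 18;  // per-entry overhead quoted in the answer
const FUDGE_BYTES = 5;      // "5 or so" bytes of conversion padding
const HALF_FULL_FACTOR = 2; // worst case: B-tree roughly half full

function estimateIndexSizeBytes(docCount, avgFieldBytes) {
  const perEntryBytes = OVERHEAD_BYTES + avgFieldBytes + FUDGE_BYTES;
  return HALF_FULL_FACTOR * docCount * perEntryBytes;
}

// Example: 10 million documents, indexed field averaging 12 bytes.
const estimatedBytes = estimateIndexSizeBytes(1e7, 12);
console.log(estimatedBytes + " bytes, about " +
  (estimatedBytes / (1024 * 1024)).toFixed(1) + " MiB");
```

For a compound index, one reasonable assumption is to pass the sum of the average sizes of the indexed fields as `avgFieldBytes`. The formula as quoted does not cover geo indexes.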

Tyler Brock answered Sep 17 '22 23:09


You can check the sizes of the indexes on a collection by using the command:

db.collection.stats()

More details here: http://docs.mongodb.org/manual/reference/method/db.collection.stats/#db.collection.stats
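Note that this only reports indexes that already exist. As a small sketch of how the output might be read, the helper below takes the document returned by `db.collection.stats()` and pulls out its `totalIndexSize` and `indexSizes` fields (both in bytes when no scale argument is passed). The helper name and the sample numbers are made up for illustration.

```javascript
// Summarize index sizes from a stats document.
// `stats` is expected to look like the result of db.collection.stats():
// it has a numeric `totalIndexSize` and an `indexSizes` map of name -> bytes.
// summarizeIndexSizes is a hypothetical helper, not a MongoDB API.
function summarizeIndexSizes(stats) {
  const toMiB = (bytes) => bytes / (1024 * 1024);
  const perIndex = Object.entries(stats.indexSizes).map(
    ([name, bytes]) => ({ name: name, mib: toMiB(bytes) })
  );
  return { totalMiB: toMiB(stats.totalIndexSize), perIndex: perIndex };
}

// Made-up sample shaped like a stats() result, so this runs without a database.
const sampleStats = {
  totalIndexSize: 3145728,
  indexSizes: { _id_: 1048576, name_1: 2097152 }
};
console.log(JSON.stringify(summarizeIndexSizes(sampleStats)));
```

In the mongo shell you would pass the real thing instead of the sample, e.g. `summarizeIndexSizes(db.collection.stats())`.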

Minh Nguyen answered Sep 20 '22 23:09