
Does MongoDB reuse deleted space?

Tags:

mongodb

First off, I know about this question:

Auto compact the deleted space in mongodb?

My question is not about shrinking DB file sizes though, but more about the reuse of deleted space. Say I have 100K documents in a collection, I then delete 50K of those. Will Mongo reuse the space within its data file that the deleted documents have freed? Or are they simply "marked" as deleted?

I don't care so much about the actual size of the file on disk; it's more about "does it just grow and grow?".

Kong asked Nov 15 '12 01:11



1 Answer

Update (Mar 2015): As of the 3.0 release, there are multiple storage engines available in MongoDB. This answer applies to the MMAP storage engine (MMAPv1, still the default in MongoDB 3.0). The answer for other engines (WiredTiger, for example) is quite different, and their behavior may well be tunable. If you are using another engine, please read the docs for that storage engine to determine what your space re-use defaults and options are.

With the MMAP storage engine, when documents are deleted the space left behind is put onto a free list. However, that space is only reused if similarly sized documents are inserted later and MongoDB finds a suitable free slot for the new document within a certain time frame; once it times out looking at the list, it simply appends. Otherwise, re-use will not happen very often. All of this happens inside the existing data files, so no disk space is reclaimed at this point.
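To make the free-list behavior concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not MongoDB's actual implementation: the `ToyDataFile` class, the first-fit search, and the `max_probes` limit (standing in for the "time out and append" behavior) are all hypothetical names and simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataFile:
    """Toy model of a data file with a free list (NOT MongoDB's real code)."""
    size: int = 0                                  # total bytes allocated in the file
    records: dict = field(default_factory=dict)    # doc_id -> (offset, length)
    free_list: list = field(default_factory=list)  # [(offset, length), ...]

    def insert(self, doc_id, length, max_probes=4):
        # Only the first few free slots are examined; this probe limit is an
        # assumption standing in for "time out looking at the list".
        for i, (offset, free_len) in enumerate(self.free_list[:max_probes]):
            if free_len >= length:
                del self.free_list[i]
                # The record takes the whole slot; any slack is padding.
                self.records[doc_id] = (offset, free_len)
                return "reused"
        # No suitable hole found in time: append, so the file grows.
        self.records[doc_id] = (self.size, length)
        self.size += length
        return "appended"

    def delete(self, doc_id):
        # The space goes onto the free list; the file itself never shrinks.
        self.free_list.append(self.records.pop(doc_id))


f = ToyDataFile()
for i in range(10):
    f.insert(i, 400)           # ten 400-byte documents
size_before = f.size

for i in range(5):
    f.delete(i)                # delete half of them
size_after_delete = f.size     # unchanged: deletion reclaims nothing on disk

outcome_same_size = f.insert("a", 400)  # fits a freed slot
outcome_larger = f.insert("b", 900)     # no freed slot is big enough
```

In this model the file stops growing only when later inserts fit the holes left by earlier deletes, which is the condition described above.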

If you subsequently run a repair, or resync a secondary from scratch, the data files are rewritten and the space on disk is reclaimed (any padding on documents is also removed). This is where you will see actual on-disk space reclamation. For any other action (compact included), the on-disk usage will not change and may even increase.

With 2.2+ you can use the collMod command with the usePowersOf2Sizes option to make re-use of deleted space more likely (note that this is the default in 2.6+). The initial space allocation for a document is then a bit less efficient (512 bytes for a 400-byte doc, for example), but a newly inserted doc is more likely to be able to re-use freed space. If you are deleting (or growing and hence moving) documents a lot, this will be more efficient in the long term.
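The arithmetic behind the 400-byte example can be sketched as follows. This is a simplified illustration: the helper names and the 32-byte minimum are assumptions, and the real allocator also accounts for record headers and caps very large allocations. The second helper only builds the command document; the collection name `events` is hypothetical.

```python
def power_of_2_allocation(doc_bytes, minimum=32):
    """Smallest power of two >= doc_bytes (toy model; the floor is an assumption)."""
    size = minimum
    while size < doc_bytes:
        size *= 2
    return size


def collmod_command(collection):
    # Command document that enables the option on one collection.
    # "collMod" must be the first key, which Python dicts preserve.
    return {"collMod": collection, "usePowersOf2Sizes": True}


slot_for_400 = power_of_2_allocation(400)   # 512, as in the example above
slot_for_450 = power_of_2_allocation(450)   # also 512, so it can reuse that slot
slot_for_600 = power_of_2_allocation(600)   # 1024, needs a different slot
cmd = collmod_command("events")
```

Because 400-byte and 450-byte documents both round up to the same slot size, a slot freed by one can hold the other. With a driver such as PyMongo, the command document could be sent with something like `db.command(collmod_command("events"))` against a server that still supports this option.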

For anyone who is interested, one of the people who wrote a lot of the storage code (Mathias Stearn) has a great presentation about the storage internals, which can be found here.

Adam Comerford answered Sep 21 '22 20:09