I have an application that use mongo for storing short living data. All data older than 45 minutes is removed by script something like:
oldSearches = [list of old searches]
connection = Connection()
db = connection.searchDB
res = db.results.remove{'search_id':{"$in":oldSearches}})
I've checked current status -
>db.results.stats()
{
"ns" : "searchDB.results",
"count" : 2865,
"size" : 1003859656,
"storageSize" : 29315124464,
"nindexes" : 1,
"ok" : 1
}
So, according to this 1gb of data occupies 29GB of storage. Data folder looks like this(You may see that many files are very old - last accessed in the middle of may):
ls -l /var/lib/mongodb/
total 31506556
-rwxr-xr-x 1 mongodb nogroup 6 2011-06-05 18:28 mongod.lock
-rw------- 1 mongodb nogroup 67108864 2011-05-13 17:45 searchDB.0
-rw------- 1 mongodb nogroup 134217728 2011-05-13 14:45 searchDB.1
-rw------- 1 mongodb nogroup 2146435072 2011-05-20 20:45 searchDB.10
-rw------- 1 mongodb nogroup 2146435072 2011-05-28 00:00 searchDB.11
-rw------- 1 mongodb nogroup 2146435072 2011-05-27 13:45 searchDB.12
-rw------- 1 mongodb nogroup 2146435072 2011-05-29 16:45 searchDB.13
-rw------- 1 mongodb nogroup 2146435072 2011-06-07 13:50 searchDB.14
-rw------- 1 mongodb nogroup 2146435072 2011-06-06 01:45 searchDB.15
-rw------- 1 mongodb nogroup 2146435072 2011-06-07 13:50 searchDB.16
-rw------- 1 mongodb nogroup 2146435072 2011-06-07 13:50 searchDB.17
-rw------- 1 mongodb nogroup 2146435072 2011-06-06 09:07 searchDB.18
-rw------- 1 mongodb nogroup 268435456 2011-05-13 14:45 searchDB.2
-rw------- 1 mongodb nogroup 536870912 2011-05-11 00:45 searchDB.3
-rw------- 1 mongodb nogroup 1073741824 2011-05-29 23:37 searchDB.4
-rw------- 1 mongodb nogroup 2146435072 2011-05-13 17:45 searchDB.5
-rw------- 1 mongodb nogroup 2146435072 2011-05-18 17:45 searchDB.6
-rw------- 1 mongodb nogroup 2146435072 2011-05-16 01:45 searchDB.7
-rw------- 1 mongodb nogroup 2146435072 2011-05-17 13:45 searchDB.8
-rw------- 1 mongodb nogroup 2146435072 2011-05-23 16:45 searchDB.9
-rw------- 1 mongodb nogroup 16777216 2011-06-07 13:50 searchDB.ns
-rw------- 1 mongodb nogroup 67108864 2011-04-23 18:51 test.0
-rw------- 1 mongodb nogroup 16777216 2011-04-23 18:51 test.ns
According to "top" mongod uses 29G of virtual memory ( and 780Mb of RSS)
Why do i have such abnormal values? Do i need to run something additional to .remove() function to clean up database from old values?
fileSize is larger than storageSize because it includes index extents and yet-unused space in data files. While fileSize does decrease when you delete a database, fileSize does not decrease as you remove collections, documents or indexes.
The maximum size an individual document can be in MongoDB is 16MB with a nested depth of 100 levels. Edit: There is no max size for an individual MongoDB database.
Large objects, or "files", are easily stored in MongoDB. It is no problem to store 100MB videos in the database. This has a number of advantages over files stored in a file system. Unlike a file system, the database will have no problem dealing with millions of objects.
Virtual memory size and resident size will appear to be very large for the mongod process. This is benign: virtual memory space will be just larger than the size of the datafiles open and mapped; resident size will vary depending on the amount of memory not used by other processes on the machine.
http://www.mongodb.org/display/DOCS/Caching
When you remove an object from MongoDB collection, the space it occupied is not automatically garbage collected and new records are only appended to the end of data files, making them grow bigger and bigger. This explains it all:
http://www.mongodb.org/display/DOCS/Excessive+Disk+Space
For starters, simply use:
db.repairDatabase()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With