
Why is my MongoDB fileSize much bigger than storageSize in db.stats()?

I have a database named log_test1 that contains a single capped collection, logs, with a maximum size of 512M. After inserting 200k documents, I found that the database uses 1.6G on disk. db.stats() reports a storageSize of 512M, as expected, but the actual fileSize is 1.6G. Why does this happen? How can I limit disk usage to just the capped collection size plus the index size?

> use log_test1
switched to db log_test1
> db.stats()
{
    "db" : "log_test1",
    "collections" : 3,
    "objects" : 200018,
    "avgObjSize" : 615.8577328040476,
    "dataSize" : 123182632,
    "storageSize" : 512008192,
    "numExtents" : 3,
    "indexes" : 8,
    "indexSize" : 71907920,
    "fileSize" : 1610612736,
    "nsSizeMB" : 16,
    "dataFileVersion" : {
        "major" : 4,
        "minor" : 5
    },
    "ok" : 1
}
Tyr asked Dec 12 '13 01:12



1 Answer

This is probably because MongoDB preallocates data and journal files.


MongoDB 2

In the data directory, MongoDB preallocates data files to a particular size, in part to prevent file system fragmentation. MongoDB names the first data file <databasename>.0, the next <databasename>.1, etc. The first file mongod allocates is 64 megabytes, the next 128 megabytes, and so on, up to 2 gigabytes, at which point all subsequent files are 2 gigabytes. The data files therefore include space that has been allocated but holds no data yet: mongod may allocate a 1 gigabyte data file that is 90% empty. For most larger databases, unused allocated space is small compared to the database.
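The doubling scheme described above can be sketched in a few lines of plain JavaScript. This is only an illustration of the arithmetic, not mongod's actual allocation logic: the function names are hypothetical, and it ignores the extra preallocated file and any special handling for capped collections.

```javascript
// Sizes (in MB) of the first `count` data files under the doubling scheme
// described above: 64, 128, 256, ... capped at 2048 MB (2 GB).
// Hypothetical helper for illustration; not part of any MongoDB API.
function preallocatedFileSizesMB(count) {
  const sizes = [];
  let size = 64;
  for (let i = 0; i < count; i++) {
    sizes.push(size);
    size = Math.min(size * 2, 2048); // later files stay at 2 GB
  }
  return sizes;
}

// Total disk space (in MB) taken by the first `count` data files.
function totalAllocatedMB(count) {
  return preallocatedFileSizesMB(count).reduce((sum, s) => sum + s, 0);
}

console.log(preallocatedFileSizesMB(7)); // [ 64, 128, 256, 512, 1024, 2048, 2048 ]
console.log(totalAllocatedMB(5));        // 1984
```

Because each new file is as large as or larger than the previous one, a database can occupy well over twice the space its data needs while it is still small.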

On Unix-like systems, mongod preallocates an additional data file and initializes the disk space to 0. Preallocating data files in the background prevents significant delays when a new database file is next allocated.

You can disable preallocation with the noprealloc runtime option. However, noprealloc is not intended for use in production environments: only use noprealloc for testing and with small data sets where you frequently drop databases.

MongoDB 3

The data files in your data directory, which is the /data/db directory in default configurations, might be larger than the data set inserted into the database. Consider the following possible causes:

Preallocated data files

MongoDB preallocates its data files to avoid filesystem fragmentation, and because of this, the size of these files does not necessarily reflect the size of your data.

The storage.mmapv1.smallFiles option will reduce the size of these files, which may be useful if you have many small databases on disk.

The oplog

If this mongod is a member of a replica set, the data directory includes the oplog.rs file, which is a preallocated capped collection in the local database.

The default allocation is approximately 5% of disk space on 64-bit installations.

The journal

The data directory contains the journal files, which store write operations on disk before MongoDB applies them to databases.

Empty records

MongoDB maintains lists of empty records in data files as it deletes documents and collections. MongoDB can reuse this space, but will not, by default, return this space to the operating system.


Taken from the MongoDB Storage FAQ.
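As a rough check against the numbers in the question, the gap between fileSize and the space assigned to collections and indexes can be computed directly. The sketch below is plain JavaScript with the values copied from the question's db.stats() output. The helper name is hypothetical, and treating fileSize minus (storageSize + indexSize) as "preallocated but unassigned" space is a simplifying assumption.

```javascript
// Values copied from the db.stats() output in the question (all in bytes).
const stats = {
  dataSize: 123182632,
  storageSize: 512008192,
  indexSize: 71907920,
  fileSize: 1610612736,
};

const MB = 1024 * 1024;

// Bytes in the data files that are not assigned to collection extents or
// indexes. Hypothetical helper; assumes fileSize covers both of those.
function unassignedBytes(s) {
  return s.fileSize - (s.storageSize + s.indexSize);
}

console.log((stats.fileSize / MB).toFixed(0) + " MB allocated in data files");
console.log((unassignedBytes(stats) / MB).toFixed(1) + " MB preallocated but unassigned");
```

With these figures the data files total exactly 1536 MB, of which roughly 979 MB is not yet assigned to the capped collection or its indexes, which is consistent with preallocation being the main cause.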

Rafa answered Nov 16 '22 04:11