How can MongoDB dataSize be larger than storageSize?

Tags:

mongodb

As far as I understand, the storage size for MongoDB should always be larger than the data size. However, after upgrading to Mongo 3.0 and switching to WiredTiger, I started seeing a data size that is larger than the storage size.

Here is the db.stats() output from one of the databases:

{ 
    "db" : "Results", 
    "collections" : NumberInt(1), 
    "objects" : NumberInt(251816), 
    "avgObjSize" : 804.4109548241573, 
    "dataSize" : NumberInt(202563549), 
    "storageSize" : NumberInt(53755904), 
    "numExtents" : NumberInt(0), 
    "indexes" : NumberInt(5), 
    "indexSize" : NumberInt(41013248), 
    "ok" : NumberInt(1)
}

Note that 202563549 > 53755904 by a wide margin. I am confused about how this can be. Is db.stats() supposed to be read differently in Mongo 3.0?

KangarooWest asked Dec 02 '15 23:12


1 Answer

The storageSize metric is the total amount of space (in bytes) allocated to the collections in the database for document storage. With the MMAPv1 storage engine, which stores data uncompressed, this number is larger than dataSize because it includes not-yet-used space in the data extents and space vacated by deleted or moved documents within extents. With WiredTiger, however, data is compressed on disk (block compression with snappy by default), while dataSize still reports the uncompressed size of the documents. That is why storageSize can be smaller than dataSize, as in your output.
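To put a number on it, here is a minimal sketch in Python that works out the effective on-disk compression ratio from the figures in the question. The `stats` dict is just the posted db.stats() output copied by hand, and the helper name `compression_ratio` is my own, not a MongoDB API. Note the ratio is only a rough estimate, since storageSize can also include allocated but unused space.

```python
# Values copied by hand from the db.stats() output in the question.
stats = {
    "objects": 251816,
    "dataSize": 202563549,     # uncompressed size of all documents, in bytes
    "storageSize": 53755904,   # space allocated on disk for the collections, in bytes
}


def compression_ratio(db_stats):
    """Return dataSize / storageSize (hypothetical helper, not part of MongoDB).

    A value above 1 means the data takes less space on disk than its
    uncompressed size, which is what you would expect with WiredTiger
    block compression enabled.
    """
    return db_stats["dataSize"] / db_stats["storageSize"]


ratio = compression_ratio(stats)
avg_obj_size = stats["dataSize"] / stats["objects"]

print(f"compression ratio: {ratio:.2f}")
print(f"average uncompressed document size: {avg_obj_size:.1f} bytes")
```

For the numbers above this comes out to roughly 3.77, meaning the documents occupy a bit more than a quarter of their uncompressed size on disk. The second figure reproduces avgObjSize, which confirms that dataSize is the uncompressed total.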

Alex answered Nov 15 '22 11:11