In my MongoDB database I have one collection capped at 5 GB, one capped at 10 MB, and a few non-capped ones. None of the non-capped ones contains more than 20 small documents.
After a long (4h) stress test, which only writes to the 5 GB capped collection, my database uses 18 GB.
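For reference, the capped collection was created roughly like this (the size option below is only an assumption based on the 5 GB cap; it may not match the exact options I used):
data-db:PRIMARY> // illustrative only: a 5 GB capped collection, actual creation options may differ
data-db:PRIMARY> db.createCollection("sms_message_event", { capped : true, size : 5 * 1024 * 1024 * 1024 })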
This is what db.stats() reports (values in MB):
data-db:PRIMARY> db.stats(1024*1024)
{
"db" : "data",
"collections" : 9,
"objects" : 8723395,
"avgObjSize" : 208.8405255064112,
"dataSize" : 1737,
"storageSize" : 5130,
"numExtents" : 12,
"indexes" : 19,
"indexSize" : 2534,
"fileSize" : 18423,
"nsSizeMB" : 16,
"ok" : 1
}
And these are the stats of the 5 GB collection (values in MB):
data-db:PRIMARY> db.sms_message_event.stats(1024*1024)
{
"ns" : "data.sms_message_event",
"count" : 8723300,
"size" : 1737,
"avgObjSize" : 0.00019912189194456226,
"storageSize" : 5120,
"numExtents" : 3,
"nindexes" : 6,
"lastExtentSize" : 1026,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 2534,
"indexSizes" : {
"_id_" : 395,
"t_1_when_-1" : 475,
"smsc_message_id_1" : 185,
"user_id_1_t_1_when_1" : 481,
"message_id_1" : 318,
"virtual_number_recipient_when_index" : 678
},
"capped" : true,
"max" : 2147483647,
"ok" : 1
}
So why is fileSize so much bigger than storageSize? I can't even run repairDatabase() now, but I did try compact() on each non-capped collection, with no result. Actually, that was expected, since the database was clean before the stress test: the data files had been deleted, not just the collections dropped.
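For completeness, this is roughly how I ran compact on each non-capped collection (the collection name below is just a placeholder):
data-db:PRIMARY> // placeholder name; I repeated this for every non-capped collection
data-db:PRIMARY> db.runCommand({ compact : "some_small_collection" })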
From the logs I can see that additional data files were created during the stress test, at roughly 1-hour intervals.
Some logs: http://pastie.org/private/t8u9caxstafbjdybgwtsfw
UPDATE: After another night, and another pass of the 4h stress test, it's up to 28 GB :(
data-db:PRIMARY> db.stats(1024*1024)
{
"db" : "data",
"collections" : 9,
"objects" : 8724995,
"avgObjSize" : 208.840894006243,
"dataSize" : 1737,
"storageSize" : 5130,
"numExtents" : 12,
"indexes" : 19,
"indexSize" : 2590,
"fileSize" : 28658,
"nsSizeMB" : 16,
"ok" : 1
}
This is happening because of a bug in how MongoDB re-uses the space allocated for capped collections. It has been filed as SERVER-9489 and will be triaged and hopefully fixed soon.
The way you can continue running your stress tests without running out of disk space is to delete the test DB directory after each test finishes, and then create a new one when you run the next test (this assumes you don't need to reuse the same data). If you do need the same data, you can use mongodump to preserve it from run to run, though there may be simpler options depending on your exact usage.
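Roughly, a dump/restore cycle between runs could look like this (the backup directory and dbpath shown are illustrative, and mongod should be stopped before wiping the dbpath):
mongodump --db data --out /backup/data-dump
# stop mongod, then remove the contents of the dbpath (path shown is illustrative)
rm -rf /data/db/*
# restart mongod, then restore the preserved data
mongorestore --db data /backup/data-dump/data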