
Possible big mistake. What exactly does "db.repairDatabase()" do? MONGODB

I have a MongoDB database with several million users. I wanted to free up space, so I wrote a bot to remove users who have been inactive for more than 6 months.

I watched the disk for several minutes and saw the usage vary, but no significant space was released, not even 1 MB. That's weird.

I've read that "remove" does not actually delete the data from disk; it only marks the space as available to be deleted or overwritten. Is that true?

That seemed to make a lot of sense to me. So I looked for something that forces the space to really be freed...

I ran repairDatabase() and I think I made a mistake. Everything is blocked!

I tried my luck and restarted the server. There is a MongoDB service, but its status stays at "Starting" (not "Running").

I'm now reading on other sites that repairDatabase() requires twice as much free space as the original size of the database, and the server does not have that.

I do not know what it is doing, and this could take several hours, or days...

Is the database lost? I think I will stop all services and delete the database.

ephramd asked Oct 17 '22 04:10



1 Answer

repairDatabase is similar to fsck. That is, it attempts to clean the database of any corrupt documents that may be preventing MongoDB from starting up. How it works in detail depends on your storage engine, but repairDatabase could potentially remove documents from the database.

The details of what the command does are outlined quite clearly (with all the warnings) in the MongoDB documentation page: https://docs.mongodb.com/manual/reference/command/repairDatabase/

I would suggest that next time you read the official documentation first rather than relying on what people say in forums. Second-hand information like this could be outdated, or just plain wrong.

Having said that, you should leave the process running until it completes, and troubleshoot only if the database still cannot be started afterwards. It may require 2x the disk space of your data, but it's also possible that the command just needs time to finish.
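To judge the disk-space side before running a repair, a rough check like the one below may help. It is only a sketch: `repairHeadroom` is a hypothetical helper (not part of MongoDB) that does arithmetic on numbers you would read yourself, namely the `dataSize`, `storageSize` and `indexSize` fields reported by `db.stats()` and the free bytes reported by the operating system. The "data plus indexes plus 2 GiB" threshold is an assumed rule of thumb, and the sample figures are made up; check the documentation for your version and storage engine for the real requirement.

```javascript
const GiB = 1024 ** 3;

// Hypothetical helper: compares free disk space against an ASSUMED
// requirement of one full copy of the data and indexes plus 2 GiB.
function repairHeadroom(stats, freeBytes) {
  const neededBytes = stats.dataSize + stats.indexSize + 2 * GiB;
  return {
    // Space allocated in the data files beyond the size of the live
    // documents. Clamped at 0 because a compressing storage engine can
    // report a storageSize smaller than dataSize.
    reusableBytes: Math.max(0, stats.storageSize - stats.dataSize),
    neededBytes,
    enoughRoom: freeBytes >= neededBytes,
  };
}

// Made-up figures: 300 GiB of documents in 500 GiB of files,
// 20 GiB of indexes, and 100 GiB free on the volume.
const report = repairHeadroom(
  { dataSize: 300 * GiB, storageSize: 500 * GiB, indexSize: 20 * GiB },
  100 * GiB
);
console.log(report.reusableBytes / GiB, "GiB allocated but not holding documents");
console.log(report.enoughRoom ? "enough free space" : "not enough free space");
```

A large `reusableBytes` value is consistent with what the question describes: removing documents leaves space that the database can reuse internally, without the files on disk getting smaller.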

kevinadi answered Oct 21 '22 00:10
