Is it possible to modify the MongoDB oplog and replay it?
A bug caused an update to be applied to more documents than it was supposed to, overwriting some data. The data was recovered from backup and reintegrated, so nothing was actually lost, but I was wondering whether there is a way to modify the oplog to remove or alter the offending update and then replay it.
I don't have in depth knowledge of MongoDB internals, so informative answers along the lines of, "you don't understand how it works, it's like this" will also be considered for acceptance.
The MongoDB oplog is a special capped collection that keeps a record of every operation that modifies the data stored in the database. It is created automatically the first time a replica set member starts, with a default size.
The replication oplog window is the approximate amount of time's worth of operations the primary's oplog can hold, given the current rate at which oplog data is generated; monitoring alerts are typically raised when this window meets or falls below a configured threshold.
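You can check the current window from the shell with a built-in helper, which prints the configured oplog size and the time span between the first and last oplog entries:
rs.printReplicationInfo()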
One of the big issues with data corruption caused by an application bug or human error is that the offending write to the primary is immediately replicated to the secondaries.
This is one of the reasons users take advantage of "slaveDelay" - an option to run one of your secondary nodes with a fixed time delay (of course that only helps if you discover the error or bug within a period shorter than the delay on that secondary).
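For reference, a delayed secondary is configured through the replica set configuration; here is a minimal sketch in the shell, where the member index and the one-hour delay are assumptions (newer MongoDB versions use secondaryDelaySecs in place of slaveDelay):
cfg = rs.conf()
cfg.members[2].priority = 0       // a delayed member should never become primary
cfg.members[2].hidden = true      // hide it from application reads
cfg.members[2].slaveDelay = 3600  // stay one hour behind the primary (assumed value)
rs.reconfig(cfg)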
In case you don't have such a set-up, you have to rely on a backup to recreate the state of the records you need to restore to their pre-bug state.
Perform all the operations on a separate stand-alone copy of your data - only after verifying that everything was properly recreated should you then move the corrected data into your production system.
What is required to be able to do this is a recent copy of the backup (let's say the backup is X hours old) and an oplog on your cluster that holds more than X hours' worth of data. I didn't specify which node's oplog because (a) every member of the replica set has the same contents in its oplog and (b) the oplog size may be different on different members, in which case you want to check the "largest" one.
So let's say your most recent backup is 52 hours old, but luckily you have an oplog that holds 75 hours worth of data (yay).
You already realized that all of your nodes (primary and secondaries) have the "bad" data, so what you would do is restore this most recent backup into a new mongod. This is where you will restore these records to what they were right before the offending update - and then you can just move them into the current primary from where they will get replicated to all the secondaries.
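Assuming the backup was taken with mongodump (the data path, port, and dump directory below are placeholders), spinning up a fresh standalone and restoring into it might look like this:
mongod --dbpath /data/restore --port 27018 --fork --logpath /data/restore/mongod.log
mongorestore --host localhost --port 27018 /backups/mydump/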
While restoring your backup, create a mongodump of your oplog collection via this command:
mongodump -d local -c oplog.rs -o oplogD
Move the oplog to its own directory renaming it to oplog.bson:
mkdir oplogR
mv oplogD/local/oplog.rs.bson oplogR/oplog.bson
Now you need to find the "offending" operation. You can dump the oplog out to human-readable form by running the bsondump command on the oplogR/oplog.bson file (and then use grep or what-not to find the "bad" update). Alternatively, you can query against the original oplog in the replica set via use local and db.oplog.rs.find() in the shell.
Your goal is to find this entry and note its ts field.
It might look like this:
"ts" : Timestamp( 1361497305, 2789 )
Note that the mongorestore command has two options, one called --oplogReplay and the other called --oplogLimit. You will now replay this oplog on the restored stand-alone server, but you will stop right before this offending update operation.
The command would be (host and port are where your newly restored backup is):
mongorestore -h host --port NNNN --oplogReplay --oplogLimit 1361497305:2789 oplogR
This will restore each operation from the oplog.bson file in oplogR directory stopping right before the entry with ts value Timestamp(1361497305, 2789).
Recall that the reason you were doing this on a separate instance is so you can verify that the restore and replay created correct data - once you have verified it, you can write the restored records to the appropriate place in the real primary (and allow replication to propagate the corrected records to the secondaries).
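A minimal sketch of that last step, assuming the affected documents live in a hypothetical mydb.mycollection and that you run this from a shell connected to the restored standalone (the primary address, namespace, and matching query are all assumptions):
var primary = new Mongo("primaryHost:27017")      // the live replica set primary (assumed address)
var src = db.getSiblingDB("mydb").mycollection    // restored, verified copy on the standalone
var dst = primary.getDB("mydb").mycollection      // same collection on the live primary
src.find({ /* query matching the affected documents */ }).forEach(function (doc) {
    // overwrite the bad version on the primary with the corrected one
    dst.replaceOne({ _id: doc._id }, doc, { upsert: true })
})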