I need some way to push data from client databases to a central database. Basically, there are several instances of MongoDB running on remote machines (clients), and I need some method to periodically update the central Mongo database with the newly added and modified documents from the clients. Each client must replicate its records to the single central server.
E.g.:
If I have 3 Mongo instances running on 3 machines, each holding 10 GB of data, then after the data migration the 4th machine's MongoDB must have 30 GB of data, and the central MongoDB machine must be periodically updated with the data from all 3 machines. But these 3 machines not only get new documents; existing documents in them may also be updated. I would like the central MongoDB machine to receive these updates as well.
Your desired replication strategy is not formally supported by MongoDB.
A MongoDB replica set consists of a single primary with asynchronous replication to one or more secondary servers in the same replica set. You cannot configure a replica set with multiple primaries or replication to a different replica set.
However, there are a few possible approaches for your use case depending on how actively you want to keep your central server up to date and the volume of data/updates you need to manage.
Merging data from multiple standalone servers can create unexpected conflicts. For example, unique indexes would not know about documents created on other servers.
Ideally the data you are consolidating will still be separated by a unique database name per origin server, so you don't have strange crosstalk between disparate documents that happen to have the same namespace and _id shared by different origin servers.
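The _id collision risk can be illustrated without a running MongoDB at all. The sketch below uses plain dicts to stand in for collections: merging everything into one namespace silently loses documents on _id collisions, while keeping a separate "database" per origin server does not (the client names are made up for the example).

```python
def merge_into_one_namespace(*origins):
    """Naive merge: later origins overwrite earlier ones on _id collisions."""
    central = {}
    for docs in origins:
        for doc in docs:
            central[doc["_id"]] = doc  # collision: last writer wins
    return central

def merge_per_origin(**origins):
    """Keep a separate 'database' per origin server, so _ids never clash."""
    return {name: {doc["_id"]: doc for doc in docs}
            for name, docs in origins.items()}

client_a = [{"_id": 1, "val": "from A"}]
client_b = [{"_id": 1, "val": "from B"}]  # same _id, different document

flat = merge_into_one_namespace(client_a, client_b)
print(len(flat))  # 1 -- client A's document was silently overwritten

namespaced = merge_per_origin(client_a=client_a, client_b=client_b)
print(len(namespaced["client_a"]) + len(namespaced["client_b"]))  # 2
```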
mongodump and mongorestore
If you just need to periodically sync content to your central server, one way to do so is using mongodump and mongorestore. You can schedule a periodic mongodump from each of your standalone instances and use mongorestore to import them into the central server.
There is a --db parameter for mongorestore that allows you to restore into a different database from the original name (if needed).
mongorestore only performs inserts into the existing database (i.e. it does not perform updates or upserts). If existing data with the same _id already exists in the target database, mongorestore will not replace it.
You can use mongodump options such as --query to be more selective about the data to export (for example, only recent data rather than everything).
If you want to limit the amount of data to dump & restore on each run (for example, only exporting "changed" data), you will need to work out how to handle updates and deletions on the central server.
Given the caveats, the simplest use of this approach would be to do a full dump & restore (i.e. using mongorestore --drop) to ensure all changes are copied.
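A full dump & restore cycle for this setup could be scripted roughly as follows. The host names, backup paths, and the assumption that each origin server keeps its data in a single database called "appdb" are all illustrative, not from the answer; the snippet only builds the command lines a periodic job would hand to subprocess.run().

```python
CENTRAL = "central.example.com"              # assumed central server
CLIENTS = ["client-a", "client-b", "client-c"]

def dump_and_restore_commands(clients, central, db="appdb"):
    """Yield (dump_cmd, restore_cmd) pairs for each origin server.
    Assumes each origin keeps its data in a single database named `db`."""
    for client in clients:
        dump = ["mongodump", "--host", client, "--db", db,
                "--out", f"/backups/{client}"]
        # --drop replaces existing collections on the central server, so
        # updates and deletions on the origin are reflected after restore;
        # --db restores into a per-origin database to avoid _id crosstalk.
        restore = ["mongorestore", "--host", central, "--db", client,
                   "--drop", f"/backups/{client}/{db}"]
        yield dump, restore

for dump, restore in dump_and_restore_commands(CLIENTS, CENTRAL):
    print(" ".join(dump))
    print(" ".join(restore))
```

In a real cron job each pair would be executed in order, dump before restore, with error handling around unreachable origin servers.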
Tailing the oplog
If you need more real-time or incremental replication, a possible approach is creating tailable cursors on the MongoDB replication oplog.
This approach is basically "roll your own replication". You would have to write an application which tails the oplog on each of your MongoDB instances and looks for changes of interest to save to your central server. For example, you may only want to replicate changes for selective namespaces (databases or collections).
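The core of such an application is a loop that reads oplog entries and applies the ones of interest to the central server. The sketch below simulates that apply step against a plain dict instead of a real central server; the entry shapes ({"op": "i"/"u"/"d", "ns": ..., "o": ...}) mirror the oplog's insert/update/delete operations, though real update entries may carry update modifiers rather than the full replacement document assumed here.

```python
def apply_oplog_entry(central, entry):
    """Apply one oplog-style entry to a dict-of-dicts 'central' store."""
    ns = central.setdefault(entry["ns"], {})      # namespace -> docs by _id
    if entry["op"] == "i":                        # insert
        ns[entry["o"]["_id"]] = entry["o"]
    elif entry["op"] == "u":                      # update (full-document form)
        ns[entry["o2"]["_id"]] = entry["o"]
    elif entry["op"] == "d":                      # delete
        ns.pop(entry["o"]["_id"], None)
    return central

central = {}
oplog = [  # fabricated example entries for the "app.users" namespace
    {"op": "i", "ns": "app.users", "o": {"_id": 1, "name": "ann"}},
    {"op": "u", "ns": "app.users", "o2": {"_id": 1},
     "o": {"_id": 1, "name": "anne"}},
    {"op": "i", "ns": "app.users", "o": {"_id": 2, "name": "bob"}},
    {"op": "d", "ns": "app.users", "o": {"_id": 2}},
]
for entry in oplog:
    apply_oplog_entry(central, entry)
print(central["app.users"])  # {1: {'_id': 1, 'name': 'anne'}}
```

In a real implementation the loop would read from a tailable cursor on each origin's oplog, filter by namespace, and write to the central server instead of a dict, remembering the last-applied timestamp so it can resume after restarts.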
A related tool that may be of interest is the experimental Mongo Connector from 10gen labs. This is a Python module that provides an interface for tailing the replication oplog.
You have to implement your own code for this, and learn/understand how to work with the oplog documents.
There may be an alternative product which better supports your desired replication model "out of the box".