I want to sync my MongoDB data to Elasticsearch. I read a lot of posts talking about the Elasticsearch river plugin and mongo-connector, but all of them are deprecated for MongoDB 4 and Elasticsearch 7! As Logstash is open-source software, I would like to use it to sync both... Does anyone know how it's possible to do it?
Elasticsearch is capable of handling queries through a REST API, and this is its advantage over MongoDB for search workloads. Flat documents can be stored and indexed easily, without degrading the performance of the entire database. In addition to this, Elasticsearch can narrow down results with filters as part of a query.
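For illustration, here is a minimal sketch of such a filtered search through the REST API; the users index (mentioned later in this thread) and the status field are placeholders you would replace with your own:

# query the users index, filtering on a hypothetical status field
curl -X GET 'localhost:9200/users/_search?pretty' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "bool": {
        "filter": [
          { "term": { "status": "active" } }
        ]
      }
    }
  }'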
Monstache is a sync daemon written in Go that continuously indexes your MongoDB collections into Elasticsearch. Monstache gives you the ability to use Elasticsearch to do complex searches and aggregations of your MongoDB data and to easily build real-time Kibana visualizations and dashboards.
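As a rough sketch (not a definitive setup), a minimal Monstache configuration might look like the following; the mydb.users namespace and the local URLs are assumptions you would adjust to your deployment:

cat > monstache.toml <<'EOF'
# where to read from and write to (assumed local instances)
mongo-url = "mongodb://localhost:27017"
elasticsearch-urls = ["http://localhost:9200"]
# bulk-load these existing collections on startup (hypothetical namespace)
direct-read-namespaces = ["mydb.users"]
# then tail change streams to keep them in sync (requires a replica set)
change-stream-namespaces = ["mydb.users"]
EOF
monstache -f monstache.toml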
With the sync daemon running, any data we add to MongoDB should be replicated to Elasticsearch. Insert a document into MongoDB, then open localhost:9200/users/_search to confirm it shows up in Elasticsearch; at that point we have the data in both databases.
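Something like this quick end-to-end check (a sketch assuming a local setup and the hypothetical mydb.users collection from above) does the trick:

# insert a test document into MongoDB
mongosh --eval 'db.getSiblingDB("mydb").users.insertOne({ name: "alice" })'
# a moment later, it should appear in Elasticsearch
curl 'localhost:9200/users/_search?pretty'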
Using Monstache, it is possible to index entire MongoDB collections into Elasticsearch, and after the initial indexing, Monstache keeps everything synced in real time.
Mongo-Connector is an open-source tool from MongoDB, a real-time sync system built in Python that copies documents from MongoDB to target systems. It creates a pipeline from a MongoDB cluster to targets such as Elasticsearch or Solr: on startup it connects to both ends and copies the existing data across. Note, however, that it is no longer maintained, which is why it lags behind MongoDB 4 and Elasticsearch 7.
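For reference only, since the project is unmaintained and (as you noted) doesn't support MongoDB 4 / Elasticsearch 7, the classic invocation looked like this:

pip install mongo-connector elastic2-doc-manager
# -m: source MongoDB, -t: target Elasticsearch, -d: doc manager for the target
mongo-connector -m localhost:27017 -t localhost:9200 -d elastic2_doc_manager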
Logstash reads data from a source, transforms it with filters, and then writes it to a destination. As Logstash is a tool from the ELK stack, it has excellent integration with Elasticsearch; you can take input from MongoDB through a JDBC driver and output straight to Elasticsearch.
You may sync MongoDB and Elasticsearch with Logstash; syncing is, in fact, one of its major applications. After installing Logstash, all you need to do is specify a pipeline for your use case: one or more input sources (MongoDB in your case) and one or more output sinks (Elasticsearch in your case), written as a config file (example follows) inside Logstash's pipeline directory. Logstash takes care of the rest.
Logstash officially provides plugins for a lot of commonly used data sources and sinks; those plugins let you read data from and write data to various systems with just a bit of configuration. You just need to find the right plugin, install it, and configure it for your scenario. Logstash has an official output plugin for Elasticsearch, and its configuration is pretty intuitive. Logstash, however, doesn't provide any official input plugin for MongoDB; you need a third-party one, and the community logstash-input-mongodb plugin (which the config below uses) seems pretty promising.
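Installing it is a single command, assuming you run it from the Logstash home directory:

bin/logstash-plugin install logstash-input-mongodb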
In the end, your pipeline may look something like the following:
input {
  mongodb {
    # connection string of the source MongoDB database
    uri => 'mongodb://10.0.0.30/my-logs?ssl=true'
    # the plugin tracks its sync position in a local SQLite placeholder db
    placeholder_db_dir => '/opt/logstash-mongodb/'
    placeholder_db_name => 'logstash_sqlite.db'
    # which collection(s) to read; the plugin matches this against collection names
    collection => 'events_'
    # number of documents fetched per batch
    batch_size => 5000
  }
}
output {
  stdout {
    # print each event to stdout as well, to ease debugging
    codec => rubydebug
  }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "target_index"
    # mapping types are deprecated in Elasticsearch 7; omit this (defaults to _doc)
    # document_type => "document_type"
    # reuse the event's id as the document id so re-synced documents overwrite
    document_id => "%{id}"
  }
}
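Once the config file is in place, start Logstash against it and check the target index; the file name below is just an example:

# run Logstash with the pipeline above (or drop the file into /etc/logstash/conf.d/)
bin/logstash -f mongodb-to-es.conf
# once events have flowed through, verify them in Elasticsearch
curl 'localhost:9200/target_index/_search?pretty'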