I'm trying to find an easy way to sync data in mongoDB 4.x, to elasticsearch 6.x . My use case is for partial text search that is supported by elasticsearch but no supported by mongodb. MongoDB is the primary database for my applications.
All solutions i found seem outdated and only support older version of mongoDB / elasticsearch. These include mongodb-connector, mongodb river
What is the best tool to use so that any changes (CRUD) to data in mongoDB is automatically synced to elasticsearch?
if you work with docker you can get this tutorial
https://github.com/ziedtuihri/Monstache_Elasticsearch_Mongodb
Monstache is a sync daemon written in Go that continuously indexes your MongoDB collections into Elasticsearch. Monstache gives you the ability to use Elasticsearch to do complex searches and aggregations of your MongoDB data and easily build realtime Kibana visualizations and dashboards.
documentation for Monstache :
https://rwynn.github.io/monstache-site/
github :
https://github.com/rwynn/monstache
docker-compose.yml
version: '2.3'
networks:
test:
driver: bridge
services:
db:
image: mongo:3.0.2
expose:
- "27017"
container_name: mongodb
volumes:
- ./mongodb:/data/db
- ./mongodb_config:/data/configdb
ports:
- "27018:27017"
command: mongod --smallfiles --replSet rs0
networks:
- test
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:6.8.7
container_name: elasticsearch
volumes:
- ./elastic:/usr/share/elasticsearch/data
- ./elastic/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
ports:
- 9200:9200
command: elasticsearch -Enetwork.host=_local_,_site_ -Enetwork.publish_host=_local_
healthcheck:
test: "wget -q -O - http://localhost:9200/_cat/health"
interval: 1s
timeout: 30s
retries: 300
ulimits:
nproc: 65536
nofile:
soft: 65536
hard: 65536
memlock:
soft: -1
hard: -1
networks:
- test
monstache:
image: rwynn/monstache:rel4
expose:
- "8080"
ports:
- "8080:8080"
container_name: monstache
command: -mongo-url=mongodb://db:27017 -elasticsearch-url=http://elasticsearch:9200 -direct-read-namespace=Product_DB.Product -direct-read-split-max=2
links:
- elasticsearch
- db
depends_on:
db:
condition: service_started
elasticsearch:
condition: service_healthy
networks:
- test
replicaset.sh
#!/bin/bash
# this configuration is so important
echo "Starting replica set initialize"
until mongo --host 192.168.144.2 --eval "print(\"waited for connection\")"
do
sleep 2
done
echo "Connection finished"
echo "Creating replica set"
mongo --host 192.168.144.2 <<EOF
rs.initiate(
{
_id : 'rs0',
members: [
{ _id : 0, host : "db:27017", priority : 1 }
]
}
)
EOF
echo "replica set created"
1) run this commande en terminal $ sysctl -w vm.max_map_count=262144
if you work on a server i don't know if is necessary
2)run en terminal docker-compose build
3) run en terminal $ docker-compose up -d
don't down your container.
$ docker ps
copy the Ipadress of mongo db image
$ docker inspect id_of_mongo_image
copy the IPAddress and set it in replicaset.sh and run replicaset.sh
$ ./replicaset.sh
on terminal you shoulf see => replica set created
$ docker-compose down
4)run en terminal $ docker-compose up
finally .......
A replica set is a group of mongod instances that maintain the same data set. A replica set contains several data bearing nodes and optionally one arbiter node. Of the data bearing nodes, one and only one member is deemed the primary node, while the other nodes are deemed secondary nodes.
The primary node receives all write operations. A replica set can have only one primary capable of confirming writes with { w: "majority" } write concern; although in some circumstances, another mongod instance may transiently believe itself to also be primary.
View the replica set configuration.
Use rs.conf()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With