
MongoDB 4.x Real Time Sync to ElasticSearch 6.x +

I'm trying to find an easy way to sync data from MongoDB 4.x to Elasticsearch 6.x. My use case is partial text search, which Elasticsearch supports but MongoDB does not. MongoDB is the primary database for my applications.

All the solutions I found seem outdated and only support older versions of MongoDB and Elasticsearch. These include mongo-connector and the MongoDB river plugin.

What is the best tool to use so that any changes (CRUD) to data in MongoDB are automatically synced to Elasticsearch?

user1955934 asked Dec 13 '22 11:12



1 Answer

If you work with Docker, you can follow this tutorial:

https://github.com/ziedtuihri/Monstache_Elasticsearch_Mongodb

Monstache is a sync daemon written in Go that continuously indexes your MongoDB collections into Elasticsearch. It lets you use Elasticsearch to run complex searches and aggregations over your MongoDB data and to build real-time Kibana visualizations and dashboards.

Monstache documentation: https://rwynn.github.io/monstache-site/
GitHub: https://github.com/rwynn/monstache

docker-compose.yml

version: '2.3'
networks:
  test:
    driver: bridge

services:
  db:
    image: mongo:3.0.2
    expose:
      - "27017"
    container_name: mongodb
    volumes:
      - ./mongodb:/data/db
      - ./mongodb_config:/data/configdb
    ports:
      - "27018:27017"
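    # --replSet rs0 must match the _id used in replicaset.sh.
    # --smallfiles is an MMAPv1 option; MongoDB 4.2 and later no longer accept it.
    command: mongod --smallfiles --replSet rs0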
    networks:
      - test

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.8.7
    container_name: elasticsearch
    volumes:
      - ./elastic:/usr/share/elasticsearch/data
      - ./elastic/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
    ports:
      - 9200:9200
    command: elasticsearch -Enetwork.host=_local_,_site_ -Enetwork.publish_host=_local_
    healthcheck:
      test: "wget -q -O - http://localhost:9200/_cat/health"
      interval: 1s
      timeout: 30s
      retries: 300
    ulimits:
      nproc: 65536
      nofile:
        soft: 65536
        hard: 65536
      memlock:
        soft: -1
        hard: -1
    networks:
      - test

  monstache:
    image: rwynn/monstache:rel4
    expose:
      - "8080"
    ports:
      - "8080:8080"
    container_name: monstache
    command: -mongo-url=mongodb://db:27017 -elasticsearch-url=http://elasticsearch:9200 -direct-read-namespace=Product_DB.Product -direct-read-split-max=2
    links:
      - elasticsearch
      - db
    depends_on:
      db:
        condition: service_started
      elasticsearch:
        condition: service_healthy
    networks:
      - test
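
The healthcheck in the compose file polls the _cat/health endpoint from inside the container. A minimal sketch of the same check run from the host, assuming curl is installed and Elasticsearch is reachable on the published port 9200; ES_HOST and the function names are hypothetical helpers, not part of the tutorial:

```shell
#!/bin/bash
# Print the cluster status (green, yellow or red) reported by _cat/health.
# ES_HOST is a hypothetical variable; it defaults to the published port.
es_status() {
  curl -s "http://${ES_HOST:-localhost:9200}/_cat/health?h=status" | tr -d '[:space:]'
}

# Succeed once the cluster reports green or yellow. A single-node cluster
# often stays yellow because replica shards cannot be assigned.
es_ready() {
  case "$(es_status)" in
    green|yellow) return 0 ;;
    *) return 1 ;;
  esac
}
```

After sourcing these functions, a wait loop could look like this:

$ until es_ready; do sleep 2; done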

replicaset.sh

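#!/bin/bash

# Initializes the single-member replica set rs0 on the mongodb container.
# This step is essential: without it the replica set is never created.
# MONGO_HOST is the container's IP address (see the steps below); the default
# is only an example and will differ on your machine.
MONGO_HOST="${MONGO_HOST:-192.168.144.2}"

echo "Starting replica set initialization"
# Retry every 2 seconds until mongod accepts connections.
until mongo --host "$MONGO_HOST" --eval 'print("waited for connection")'
do
    sleep 2
done
echo "Connection established"
echo "Creating replica set"
mongo --host "$MONGO_HOST" <<EOF
rs.initiate(
  {
    _id : 'rs0',
    members: [
      { _id : 0, host : "db:27017", priority : 1 }
    ]
  }
)
EOF
echo "replica set created"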
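
The script above connects to MongoDB through the container's IP address, which has to be looked up by hand. As a rough sketch, the lookup could be scripted with docker inspect and a Go template. The function names below are hypothetical, and the container name mongodb comes from container_name in docker-compose.yml:

```shell
#!/bin/bash
# Print the IP address of a running container on its Docker network(s).
container_ip() {
  docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Print the rs.initiate() call for a single-member replica set.
# $1 is the replica set name, $2 the member address (host:port).
rs_initiate_js() {
  printf 'rs.initiate({ _id: "%s", members: [ { _id: 0, host: "%s", priority: 1 } ] })\n' "$1" "$2"
}
```

With these functions sourced, the replica set could be initialized without editing any file:

$ mongo --host "$(container_ip mongodb)" --eval "$(rs_initiate_js rs0 db:27017)"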

1) Run this command in a terminal as root:

$ sysctl -w vm.max_map_count=262144

I am not sure whether this is necessary when you work on a server.

2) Run in a terminal:

$ docker-compose build

3) Run in a terminal:

$ docker-compose up -d

Do not bring the containers down yet. List the running containers and note the ID of the mongodb container:

$ docker ps

Inspect that container:

$ docker inspect id_of_mongo_container

Copy the IPAddress value from the output, set it in replicaset.sh, and run the script:

$ ./replicaset.sh

In the terminal you should see: replica set created

Then stop the containers:

$ docker-compose down

4) Finally, run in a terminal:

$ docker-compose up

Replication in MongoDB

A replica set is a group of mongod instances that maintain the same data set. A replica set contains several data bearing nodes and optionally one arbiter node. Of the data bearing nodes, one and only one member is deemed the primary node, while the other nodes are deemed secondary nodes.
The primary node receives all write operations. A replica set can have only one primary capable of confirming writes with { w: "majority" } write concern; although in some circumstances, another mongod instance may transiently believe itself to also be primary.
To view the replica set configuration, use rs.conf().

Running MongoDB as a replica set is what allows Monstache to index your MongoDB collections into Elasticsearch in real time.
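
To confirm that the replica set described above is actually in place, the mongo shell inside the container can be queried. A minimal sketch, assuming the container is named mongodb as in docker-compose.yml; MONGO_CONTAINER and the function names are hypothetical:

```shell
#!/bin/bash
# Run a JavaScript expression in the mongo shell inside the container.
# MONGO_CONTAINER is a hypothetical variable; it defaults to "mongodb".
mongo_eval() {
  docker exec "${MONGO_CONTAINER:-mongodb}" mongo --quiet --eval "$1"
}

# Print the current replica set configuration.
show_rs_conf() {
  mongo_eval 'printjson(rs.conf())'
}

# Succeed only if this mongod currently reports itself as the primary.
is_primary() {
  [ "$(mongo_eval 'print(db.isMaster().ismaster)')" = "true" ]
}
```

With the functions sourced, a quick check could be:

$ show_rs_conf
$ is_primary && echo "rs0 has a primary"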

Zied Touahri answered Dec 28 '22 06:12
