In one of my projects, I am planning to use ElasticSearch with MySQL. I have successfully installed ElasticSearch and can manage indexes in ES on its own, but I don't know how to do the same with data that lives in MySQL.
I have read a couple of documents, but I am still confused and don't have a clear idea of how to proceed.
There are two ways to connect to your Elasticsearch cluster: Through the RESTful API or through the Java transport client. Both ways use an endpoint URL that includes a port, such as https://ec47fc4d2c53414e1307e85726d4b9bb.us-east-1.aws.found.io:9243 .
Elasticsearch is a standalone database. Its main use case is full-text search and text- or number-related queries such as aggregations. Generally, it's not recommended to use Elasticsearch as the main database, as some operations such as indexing (inserting values) are more expensive than in other databases.
As of ES 5.x, this feature is available out of the box through the Logstash JDBC input plugin.
It periodically imports data from the database and pushes it to the ES server.
You create a simple import file like the one below (which is also described here) and run it with Logstash. Logstash supports running this script on a schedule.
# file: contacts-index-logstash.conf
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "user"
    jdbc_password => "pswd"
    schedule => "* * * * *"
    jdbc_validate_connection => true
    jdbc_driver_library => "/path/to/latest/mysql-connector-java-jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    statement => "SELECT * from contacts where updatedAt > :sql_last_value"
  }
}
output {
  elasticsearch {
    # Logstash 2.0 and later talk HTTP only: use `hosts`; the older
    # `protocol` and `host` options are no longer accepted
    hosts => ["ES_NODE_HOST:9200"]
    index => "contacts"
    document_type => "contact"
    document_id => "%{id}"
  }
}
# "* * * * *" -> run every minute
# sql_last_value is a built-in parameter. On the first run it is set to Thursday, 1 January 1970,
# or 0 if use_column_value is true and tracking_column is set.
# It is updated after each subsequent query run.
You can download the MySQL connector jar from Maven here.
If the index does not exist in ES when this script is executed, it will be created automatically, just like with a normal POST call to Elasticsearch.
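To make the incremental query above more concrete, here is a minimal Python sketch of the checkpoint idea behind `:sql_last_value`. It is not Logstash code: it uses an in-memory SQLite table as a stand-in for MySQL, a plain dict as a stand-in for the ES index, and it advances the checkpoint from the `updatedAt` column, which roughly corresponds to running the plugin with `use_column_value => true`. The function names and sample rows are made up for illustration.

```python
import sqlite3


def fetch_changed_rows(conn, last_value):
    """Return rows whose updatedAt is newer than the checkpoint, oldest first."""
    cur = conn.execute(
        "SELECT id, name, updatedAt FROM contacts "
        "WHERE updatedAt > :sql_last_value ORDER BY updatedAt",
        {"sql_last_value": last_value},
    )
    return cur.fetchall()


def sync_once(conn, index, last_value):
    """One scheduled run: upsert changed rows by id and return the new checkpoint."""
    for row_id, name, updated_at in fetch_changed_rows(conn, last_value):
        # Keying by id means a changed row overwrites its old document.
        index[row_id] = {"id": row_id, "name": name, "updatedAt": updated_at}
        last_value = max(last_value, updated_at)
    return last_value


def demo():
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
    conn.execute(
        "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, updatedAt INTEGER)"
    )
    conn.executemany(
        "INSERT INTO contacts VALUES (?, ?, ?)", [(1, "Ann", 10), (2, "Bob", 20)]
    )
    index = {}  # stand-in for the ES index
    checkpoint = sync_once(conn, index, 0)  # first run imports everything
    conn.execute("UPDATE contacts SET name = 'Bobby', updatedAt = 30 WHERE id = 2")
    checkpoint = sync_once(conn, index, checkpoint)  # second run sees only row 2
    return index, checkpoint
```

Calling `demo()` returns the simulated index and the final checkpoint. Note that rows deleted in the source table are never seen by a query like this, so deletes need separate handling.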
I was finally able to find the answer, so I am sharing my findings.
To use ElasticSearch with MySQL, you need the Java Database Connectivity (JDBC) importer. With the JDBC drivers you can sync your MySQL data into Elasticsearch.
I am using Ubuntu 14.04 LTS. You will need to install Java 8 to run Elasticsearch, since it is written in Java.
The following steps install ElasticSearch 2.2.0 and ElasticSearch-jdbc 2.2.0. Please note that both versions have to match.
After installing Java 8, install Elasticsearch 2.2.0 as follows:
# cd /opt
# wget https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/deb/elasticsearch/2.2.0/elasticsearch-2.2.0.deb
# sudo dpkg -i elasticsearch-2.2.0.deb
This installs Elasticsearch in /usr/share/elasticsearch/ and places its configuration files in /etc/elasticsearch.
Now let's do some basic configuration. The config file is /etc/elasticsearch/elasticsearch.yml; open it for editing with
nano /etc/elasticsearch/elasticsearch.yml
and change the cluster name and node name.
For example:
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: servercluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: vps.server.com
#
# Add custom attributes to the node:
#
# node.rack: r1
Now save the file and start Elasticsearch:
/etc/init.d/elasticsearch start
To test whether ES is installed and running, run the following:
curl -XGET 'http://localhost:9200/?pretty'
If you get a response like the following, Elasticsearch is installed :)
{
"name" : "vps.server.com",
"cluster_name" : "servercluster",
"version" : {
"number" : "2.2.0",
"build_hash" : "8ff36d139e16f8720f2947ef62c8167a888992fe",
"build_timestamp" : "2016-01-27T13:32:39Z",
"build_snapshot" : false,
"lucene_version" : "5.4.1"
},
"tagline" : "You Know, for Search"
}
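Since the importer release has to match the Elasticsearch release, it can help to read the version out of this response before downloading anything. The Python sketch below parses a shortened copy of the response above. The matching rule in `importer_matches` (the first three version components must be equal, for example importer 2.2.0.0 for ES 2.2.0) is my assumption based on the version numbers used in this answer, not an official rule.

```python
import json

# Shortened copy of the response shown above.
SAMPLE_RESPONSE = """
{
  "name": "vps.server.com",
  "cluster_name": "servercluster",
  "version": {"number": "2.2.0", "lucene_version": "5.4.1"},
  "tagline": "You Know, for Search"
}
"""


def es_version(root_response_text):
    """Extract version.number from the JSON returned by GET /."""
    return json.loads(root_response_text)["version"]["number"]


def importer_matches(es_ver, importer_ver):
    """Assumed rule: the importer's first three version components equal the ES version."""
    return importer_ver.split(".")[:3] == es_ver.split(".")[:3]
```

For example, `importer_matches(es_version(SAMPLE_RESPONSE), "2.2.0.0")` is true, while an importer version of "2.3.3.1" would not match this node.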
Now let's install elasticsearch-jdbc.
Download the release that matches your Elasticsearch version (the link below points to 2.3.3.1, while the command further down assumes the elasticsearch-jdbc-2.2.0.0 directory):
http://xbib.org/repository/org/xbib/elasticsearch/importer/elasticsearch-jdbc/2.3.3.1/elasticsearch-jdbc-2.3.3.1-dist.zip
Extract it in /etc/elasticsearch/ and also create a "logs" folder there (the path of the logs folder should be /etc/elasticsearch/logs).
I have one database in MySQL named "ElasticSearchDatabase", and inside it a table named "test" with the fields id, name and email.
cd /etc/elasticsearch
Then run the following:
echo '{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://localhost:3306/ElasticSearchDatabase",
        "user" : "root",
        "password" : "",
        "sql" : "SELECT id as _id, id, name, email FROM test",
        "index" : "users",
        "type" : "users",
        "autocommit" : "true",
        "metrics" : {
            "enabled" : true
        },
        "elasticsearch" : {
            "cluster" : "servercluster",
            "host" : "localhost",
            "port" : 9300
        }
    }
}' | java \
    -cp "/etc/elasticsearch/elasticsearch-jdbc-2.2.0.0/lib/*" \
    -Dlog4j.configurationFile=file:///etc/elasticsearch/elasticsearch-jdbc-2.2.0.0/bin/log4j2.xml \
    org.xbib.tools.Runner \
    org.xbib.tools.JDBCImporter
Now check whether the MySQL data was imported into ES:
curl -XGET 'http://localhost:9200/users/_search?pretty'
If all goes well, you will see all your MySQL data in JSON format. If there are any errors, you will find them in the /etc/elasticsearch/logs/jdbc.log file.
Caution:
In older versions of ES, the Elasticsearch-river-jdbc plugin was used. It is completely deprecated in the latest versions, so do not use it.
I hope I could save you some time :)
Any further thoughts are appreciated.
Reference URL: https://github.com/jprante/elasticsearch-jdbc
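To see why the query selects `id as _id`, it can help to look at how a row becomes a document. The Python sketch below is only an illustration: it builds a body in the newline-delimited format of Elasticsearch's REST `_bulk` API, where the `_id` column becomes the document id and the remaining columns become the document source. The importer itself connects over the transport port (9300 in the config above) and this is not its actual code; `to_bulk_body` and the sample row are hypothetical.

```python
import json


def to_bulk_body(rows, index, doc_type):
    """Build a _bulk request body: one action line plus one source line per row."""
    lines = []
    for row in rows:
        source = dict(row)          # copy so the caller's row is not modified
        doc_id = source.pop("_id")  # the "id as _id" column becomes the document id
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk format requires every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)


rows = [{"_id": 1, "id": 1, "name": "Ann", "email": "ann@example.com"}]
body = to_bulk_body(rows, "users", "users")
```

Because the document id comes from the table's primary key, running the import again overwrites the same documents instead of creating duplicates.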
The Logstash JDBC plugin will do the job:
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/testdb"
    jdbc_user => "root"
    jdbc_password => "factweavers"
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "/home/comp/Downloads/mysql-connector-java-5.1.38.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # cron-style schedule (five fields): run every minute
    schedule => "* * * * *"
    # our query
    statement => "SELECT * FROM testtable WHERE Date > :sql_last_value ORDER BY Date"
    use_column_value => true
    tracking_column => "Date"
  }
}
output {
  stdout { codec => json_lines }
  elasticsearch {
    hosts => "localhost:9200"
    index => "test-migrate"
    document_type => "data"
    document_id => "%{personid}"
  }
}
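The `document_id => "%{personid}"` setting is what keeps repeated imports from creating duplicates: each event's id is taken from its own `personid` field, so a later version of the same row replaces the earlier document. The Python sketch below imitates that behaviour in a simplified way. It is not how Logstash is implemented, the dict stands in for the ES index, and `render` and `index_events` are hypothetical helper names.

```python
import re

# Matches field references such as %{personid}
FIELD_REF = re.compile(r"%\{([^}]+)\}")


def render(template, event):
    """Replace each %{field} with the event's value; unknown fields are left as-is."""
    def substitute(match):
        field = match.group(1)
        return str(event[field]) if field in event else match.group(0)
    return FIELD_REF.sub(substitute, template)


def index_events(events, id_template):
    """Simulate indexing: documents with the same rendered id overwrite each other."""
    index = {}
    for event in events:
        index[render(id_template, event)] = event
    return index


events = [
    {"personid": 7, "name": "Ann"},
    {"personid": 7, "name": "Ann B."},  # a later version of the same row
]
index = index_events(events, "%{personid}")
```

Here `index` ends up with a single document under the id "7". If the field named in the template is missing from the event, the reference stays literal, so it is worth checking that the column really exists under that name in your query result.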