
Alternatives to Elasticsearch river plugins [closed]

I want to synchronize an Elasticsearch index with the contents of an SQL database. The Elasticsearch JDBC river meets all my requirements, but its documentation says that the plugin is deprecated.

I don't want to use a tool that won't be supported in the coming years. What are the alternatives?

The river's documentation says:

Note, JDBC plugin is not only a river, but also a standalone module. Because Elasticsearch river API is deprecated, this is an important feature.

Why is it an important feature? Does it mean that I can still use it indefinitely despite the deprecation, for example by using a feeder instead of a river?

Heschoon asked Apr 16 '15 12:04



2 Answers

Some alternatives:

  • Rivers can still be used until version 2.0 of Elasticsearch, but that's not a long-term solution.
  • You can write your own solution, as plmaheu suggested. It's some work, but it will fit your program perfectly and is recommended on the Elasticsearch blog.
  • Instead of writing a lot of custom code, you can send the insert/update/delete requests to Logstash, which will apply them to Elasticsearch. I like this solution since Logstash will batch the requests into bulks for you and handle other things that you don't want to implement yourself.
  • I heard that you can use an ETL tool like Talend, but I didn't investigate that solution since it's a paid product.
  • There is the gatherer plugin, which was supposed to replace the rivers. However, it has not been updated since last year, so the project has likely been abandoned.

The two solutions recommended on the ES blog are writing your own solution or using Logstash. Choose the one that fits your requirements.
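
If you go the custom route, the batching that Logstash would otherwise do for you comes down to serializing your changes into the newline-delimited body of the Elasticsearch `_bulk` endpoint. Here is a minimal sketch in Python; the function name `to_bulk_body`, the `(op, doc_id, doc)` tuple shape and the `products` index are illustrative assumptions, not part of any library. Older Elasticsearch versions also expect a `_type` in the action metadata or in the URL.

```python
import json


def to_bulk_body(changes, index):
    """Serialize (op, doc_id, doc) tuples into an Elasticsearch _bulk body.

    `op` is one of "insert", "update" or "delete" (names chosen for this
    sketch); `doc` is a dict, and is ignored for deletes.
    """
    lines = []
    for op, doc_id, doc in changes:
        meta = {"_index": index, "_id": str(doc_id)}
        if op == "insert":
            # "index" creates the document or replaces it entirely.
            lines.append(json.dumps({"index": meta}))
            lines.append(json.dumps(doc))
        elif op == "update":
            # "update" with a "doc" payload merges a partial document.
            lines.append(json.dumps({"update": meta}))
            lines.append(json.dumps({"doc": doc}))
        elif op == "delete":
            # "delete" has an action line only, no source line.
            lines.append(json.dumps({"delete": meta}))
        else:
            raise ValueError(f"unknown operation: {op!r}")
    # The bulk API requires the body to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""


# Example: three changes become five NDJSON lines.
body = to_bulk_body(
    [("insert", 1, {"name": "a"}), ("update", 2, {"name": "b"}), ("delete", 3, None)],
    "products",
)
```

The resulting string would be POSTed to the `_bulk` endpoint of your cluster; error handling and retries are left out here and would need to be added for real use.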

Note: a lot of great solutions are currently in development to replace the rivers, the logstash-jdbc input for example. The deprecation of the rivers is quite recent, and we can expect many replacements to emerge over the next months or years.

Heschoon answered Nov 09 '22 13:11



You're probably better off writing your own. Rivers don't have that many features, and you will very probably need finer-grained control over your data access than a river allows. You need two high-level components:

  • An executable tool that fetches data from the SQL server and sends it to Elasticsearch.
  • A scheduler, to make the tool run at the interval you need.
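
The two components above can be sketched roughly as follows. This is only an outline under stated assumptions: `sqlite3` stands in for your real SQL server driver, the `products` table and its `updated_at` column are hypothetical, and `sink` is whatever function you use to send documents to Elasticsearch (for example one that posts a bulk request).

```python
import sqlite3
import time


def sync_once(conn, last_seen, sink):
    """Fetch rows changed after `last_seen`, hand them to `sink`,
    and return the new high-water mark."""
    cur = conn.execute(
        "SELECT id, name, updated_at FROM products "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    )
    docs = [{"id": r[0], "name": r[1], "updated_at": r[2]} for r in cur]
    if docs:
        sink(docs)
        # Rows are ordered, so the last one carries the newest timestamp.
        last_seen = docs[-1]["updated_at"]
    return last_seen


def run_every(interval_s, job, max_runs=None):
    """Very small scheduler: call `job` every `interval_s` seconds.

    `max_runs=None` means run forever; a number is handy for testing.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(interval_s)


# Usage sketch (not executed here):
#   conn = sqlite3.connect("app.db")
#   state = {"mark": 0}
#   def job():
#       state["mark"] = sync_once(conn, state["mark"], send_to_elasticsearch)
#   run_every(60, job)
```

In practice cron or a similar system scheduler can replace `run_every`, as long as the high-water mark is persisted between runs. Note that polling on a timestamp column does not see deleted rows, so deletions need separate handling (for instance a soft-delete flag).
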
plmaheu answered Nov 09 '22 13:11
