
Elasticsearch for Spark 3.0

I'm getting issues while using Spark 3.0 to read from Elasticsearch. My Elasticsearch version is 7.6.0, and I used the Elasticsearch jar of the same version. Please suggest a solution.

JUGAL KISHORE asked Jul 24 '20 05:07



2 Answers

Spark 3.0.0 relies on Scala 2.12, which elasticsearch-hadoop does not support yet. This and a few further issues prevent us from using Spark 3.0.0 together with Elasticsearch. If you want to compile it yourself, there is a pull request on elasticsearch-hadoop (https://github.com/elastic/elasticsearch-hadoop/pull/1308) which should at least allow using Scala 2.12. I'm not sure whether it fixes the other issues as well.
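To see which Scala version a local Spark installation was built with, you can parse the banner printed by `spark-submit --version`. This is only a rough sketch: it assumes the banner contains a line of the form "Using Scala version 2.12.10, ..." and that Spark prints it on stderr. The helper name `scala_binary_version` is made up for this example.

```shell
#!/bin/sh
# Extract the Scala binary version (e.g. "2.12") from text on stdin that
# contains a line like "Using Scala version 2.12.10, ...".
# Assumption: this is the wording of the Spark version banner.
scala_binary_version() {
  sed -n 's/.*Scala version \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Only query Spark when spark-submit is actually on the PATH.
if command -v spark-submit >/dev/null 2>&1; then
  # The banner is assumed to go to stderr, hence the 2>&1 redirect.
  spark-submit --version 2>&1 | scala_binary_version
fi
```

The Scala suffix in the connector jar name (the `_2.12` part of `elasticsearch-spark-30_2.12`) has to match the version this prints.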

Simon Andermatt answered Sep 19 '22 11:09



There is no official release yet, but you can compile the dependency yourself from https://github.com/elastic/elasticsearch-hadoop. The steps are:

  1. git clone https://github.com/elastic/elasticsearch-hadoop.git
  2. cd elasticsearch-hadoop/
  3. vim ~/.bashrc
  4. add the line: export JAVA8_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
  5. source ~/.bashrc
  6. ./gradlew elasticsearch-spark-30:distribution --console=plain

Finally, you can find the .jar package in the folder "elasticsearch-hadoop/spark/sql-30/build/distributions"; elasticsearch-spark-30_2.12-8.0.0-SNAPSHOT.jar is the Elasticsearch connector package.
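The steps above can be sketched as a single script. Treat it as a starting point, not a tested build recipe. It assumes the connector sources live in the elastic/elasticsearch-hadoop repository (the one the pull request in the first answer belongs to) and that JDK 8 is installed at the path from step 4. `RUN_BUILD`, `REPO_DIR`, `find_connector_jar` and `build_connector` are names introduced for this example only.

```shell
#!/bin/sh
# Sketch of the build steps as one script. Nothing is cloned or built
# unless RUN_BUILD=1 is set (a hypothetical switch for this example).

REPO_URL="https://github.com/elastic/elasticsearch-hadoop.git"
REPO_DIR="${REPO_DIR:-elasticsearch-hadoop}"

# Print the first connector jar found under the given checkout directory;
# return non-zero if there is none.
find_connector_jar() {
  for jar in "$1"/spark/sql-30/build/distributions/elasticsearch-spark-30_*.jar; do
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  return 1
}

# Clone the repository if needed and run the Gradle task from the answer.
build_connector() {
  # JDK 8 location taken from the answer; adjust it for your machine.
  export JAVA8_HOME="${JAVA8_HOME:-/usr/lib/jvm/java-8-openjdk-amd64/}"
  [ -d "$REPO_DIR" ] || git clone "$REPO_URL" "$REPO_DIR"
  (cd "$REPO_DIR" && ./gradlew elasticsearch-spark-30:distribution --console=plain)
}

if [ "${RUN_BUILD:-0}" = "1" ]; then
  build_connector
  find_connector_jar "$REPO_DIR"
fi
```

Exporting JAVA8_HOME inside the script replaces the ~/.bashrc edit for a one-off build. Once the jar exists, you can pass the printed path to Spark with the `--jars` option of `spark-shell` or `spark-submit`.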

Ray answered Sep 16 '22 11:09
