Spark app unable to write to Elasticsearch cluster running in Docker

I have an Elasticsearch Docker image listening on 127.0.0.1:9200. I tested it using Sense and Kibana, and it works fine: I am able to index and query documents. But when I try to write to it from a Spark app:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.elasticsearch.spark._  // brings saveToEs into scope

val sparkConf = new SparkConf().setAppName("ES").setMaster("local")
sparkConf.set("es.index.auto.create", "true")
sparkConf.set("es.nodes", "127.0.0.1")
sparkConf.set("es.port", "9200")
sparkConf.set("es.resource", "spark/docs")

val sc = new SparkContext(sparkConf)
val sqlContext = new SQLContext(sc)
val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
val airports = Map("arrival" -> "Otopeni", "SFO" -> "San Fran")
val rdd = sc.parallelize(Seq(numbers, airports))

rdd.saveToEs("spark/docs")

It fails to connect and keeps retrying:

16/07/11 17:20:07 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
16/07/11 17:20:07 INFO HttpMethodDirector: Retrying request

I tried using the IP address given by docker inspect for the Elasticsearch container, but that does not work either. However, when I use a native installation of Elasticsearch, the Spark app runs fine. Any ideas?


1 Answer

In addition to the settings above, set the config

es.nodes.wan.only to true

as mentioned in this answer, if you are having issues writing to ES. By default the connector discovers the cluster's data nodes and connects to them directly, but the node addresses Elasticsearch publishes from inside a Docker container are typically not reachable from the host; with es.nodes.wan.only enabled, the connector routes all requests through the declared es.nodes instead.
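For illustration, here is a minimal sketch of the job from the question with that option added; it assumes the same elasticsearch-hadoop connector and the local single-node Docker setup described above.

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

val sparkConf = new SparkConf().setAppName("ES").setMaster("local")
sparkConf.set("es.index.auto.create", "true")
sparkConf.set("es.nodes", "127.0.0.1")
sparkConf.set("es.port", "9200")
// Treat the cluster as WAN-only: talk to 127.0.0.1:9200 directly and skip
// node discovery, whose container-internal addresses are unreachable here.
sparkConf.set("es.nodes.wan.only", "true")

val sc = new SparkContext(sparkConf)
val rdd = sc.parallelize(Seq(Map("one" -> 1), Map("arrival" -> "Otopeni")))
rdd.saveToEs("spark/docs")

The trade-off is that all traffic funnels through the nodes you declare, which costs some write parallelism, but that is exactly what you want when Elasticsearch sits behind Docker port mapping.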
