Getting connection error while reading data from Elasticsearch using Apache Spark & Scala

Question

I wrote the following code:

import org.elasticsearch.spark._ // brings esRDD into scope on SparkContext

val conf = new org.apache.spark.SparkConf()
      .setMaster("local[*]")
      .setAppName("es-example")
      .set("es.nodes", "search-2meoihmu.us-est-1.es.amazonaws.com")

val sc = new org.apache.spark.SparkContext(conf)
val resource = "index/data" // index/type to read from
val count = sc.esRDD(resource).count()
println(count)

with the following versions:

Elasticsearch version = 1.5.2
Spark version = 1.5.2
Scala version = 2.10.4

and this library dependency:

libraryDependencies += "org.elasticsearch" % "elasticsearch-spark_2.10" % "2.1.3"

I get the following error when running the program:

Exception in thread "main" org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed

How can I read data from Elasticsearch using Spark and Scala?

Jane Wayne · Accepted Answer

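Take a look at the option "es.nodes.wan.only". It defaults to "false"; when I set it to "true", the exception went away. According to the documentation, in this mode the connector disables node discovery and connects only through the nodes declared in "es.nodes", which matters when the cluster's internal node addresses are not reachable from the Spark side. The current documentation for the configuration values is here: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html.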

val conf = new org.apache.spark.SparkConf()
 .setMaster("local[*]")
 .setAppName("es-example")
 .set("es.nodes", "search-2meoihmu.us-est-1.es.amazonaws.com")
 .set("es.nodes.wan.only", "true")

Note that the documentation recommends setting this value to true for cloud environments such as AWS, but I also hit this exception when pointing at a plain VM running Elasticsearch.
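
If it helps, here is a minimal sketch of keeping the connector settings in one place so they are easy to vary while debugging. The object name EsSettings and the method forEndpoint are hypothetical helpers, not part of elasticsearch-hadoop; only the keys "es.nodes", "es.port" and "es.nodes.wan.only" come from the connector's configuration page. Port 80 in the usage comment is an assumption: the connector defaults to 9200, while AWS-hosted endpoints are typically served on 80/443, so check which port your endpoint actually listens on. Also verify that "es.nodes.wan.only" exists in the connector version you depend on; as far as I recall it arrived in the 2.2 line, so an older release such as 2.1.3 may need upgrading first.

```scala
// Hypothetical helper (not part of elasticsearch-hadoop): builds the
// connector settings as a plain Map so they can be inspected or swapped.
object EsSettings {
  // host    -> "es.nodes"
  // port    -> "es.port" (the connector's own default is 9200)
  // wanOnly -> "es.nodes.wan.only" (the connector's own default is false)
  def forEndpoint(host: String, port: Int, wanOnly: Boolean): Map[String, String] =
    Map(
      "es.nodes" -> host,
      "es.port" -> port.toString,
      "es.nodes.wan.only" -> wanOnly.toString
    )
}

// Usage with Spark (needs spark-core and elasticsearch-spark on the classpath):
//   import org.elasticsearch.spark._
//   val conf = new org.apache.spark.SparkConf()
//     .setMaster("local[*]")
//     .setAppName("es-example")
//     .setAll(EsSettings.forEndpoint(
//       "search-2meoihmu.us-est-1.es.amazonaws.com", 80, wanOnly = true)) // port 80 is an assumption
//   val sc = new org.apache.spark.SparkContext(conf)
//   println(sc.esRDD("index/data").count())
```

Keeping the settings in a Map means you can print them before creating the SparkContext and confirm exactly which host, port and WAN mode the job will use.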

Tags:

scala

apache-spark

Devi
