
Scala Spark connect to remote cluster

I wish to connect to a remote cluster and execute a Spark process. So, from what I have read, this is specified in the SparkConf.

 import org.apache.spark.SparkConf

 val conf = new SparkConf()
   .setAppName("MyAppName")
   .setMaster("spark://my_ip:7077")

Here my_ip is the IP address of my cluster. Unfortunately, I get a "connection refused" error, so I am guessing some credentials must be added to connect correctly. How would I specify the credentials? It seems it would be done with .set(key, value), but I have no leads on this.

Alessandro La Corte asked Apr 26 '17 09:04





1 Answer

There are two things missing:

  • The cluster manager should be set to YARN (setMaster("yarn")) and the deploy mode to cluster. Your current setup, with a spark:// master URL, targets Spark standalone. More info here: http://spark.apache.org/docs/latest/configuration.html#application-properties
  • You also need to get the yarn-site.xml and core-site.xml files from the cluster and put them in the directory that HADOOP_CONF_DIR points to, so that Spark can pick up the YARN settings, such as the address of your master node. More info: https://theckang.github.io/2015/12/31/remote-spark-jobs-on-yarn.html

By the way, this works if you use spark-submit to submit the job. Doing it programmatically is more complex and can only use yarn-client mode, which is tricky to set up remotely.
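For the programmatic route, a minimal sketch of what a yarn-client setup might look like is given here. It assumes the Spark YARN module is on the application classpath and that HADOOP_CONF_DIR (or YARN_CONF_DIR) points at a local copy of the cluster's yarn-site.xml and core-site.xml; depending on how the application is launched, that directory may also need to be on the classpath. The object name RemoteYarnApp and the helper buildConf are made up for illustration, and the small job at the end is only a smoke test.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical wrapper object, not part of the original question or answer.
object RemoteYarnApp {

  // Builds a configuration that targets YARN in client mode.
  // "MyAppName" comes from the question; the rest is illustrative.
  def buildConf(): SparkConf =
    new SparkConf()
      .setAppName("MyAppName")
      .setMaster("yarn")                        // YARN instead of spark://my_ip:7077
      .set("spark.submit.deployMode", "client") // the driver runs in this JVM

  def main(args: Array[String]): Unit = {
    // Spark locates the ResourceManager through the Hadoop config files,
    // so fail early if no config directory has been set (assumed check).
    require(
      sys.env.contains("HADOOP_CONF_DIR") || sys.env.contains("YARN_CONF_DIR"),
      "Set HADOOP_CONF_DIR to the directory holding yarn-site.xml and core-site.xml"
    )

    val sc = new SparkContext(buildConf())
    try {
      // Trivial job to confirm that executors were actually allocated.
      val total = sc.parallelize(1 to 100).sum()
      println(s"Sum computed on the cluster: $total")
    } finally {
      sc.stop()
    }
  }
}
```

In client mode the driver stays on the submitting machine, so the cluster nodes must be able to connect back to it; this is a large part of why the remote setup tends to be tricky.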

jamborta answered Oct 16 '22 06:10
