 

How to set YARN queue for spark-shell?

I'm running some Spark (Scala) SQL code in spark-shell. I want to know which YARN queue I am using, and, if possible, how much memory and how many executors I am using and how to optimize them.

asked Dec 29 '18 by user8167344


People also ask

How do you start the Spark shell in YARN mode?

Launching Spark on YARN: ensure that HADOOP_CONF_DIR or YARN_CONF_DIR points to the directory that contains the (client-side) configuration files for the Hadoop cluster. These configs are used to write to HDFS and to connect to the YARN ResourceManager.
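As a minimal sketch, assuming a typical client-mode setup (the configuration path and queue name here are placeholders, not values from this page):

export HADOOP_CONF_DIR=/etc/hadoop/conf   # directory containing core-site.xml, yarn-site.xml, etc.
spark-shell --master yarn --queue my_queue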

How do I submit a queue in Spark?

Open the ResourceManager UI and confirm the queues that are configured. Log in to the cluster and submit the job to the desired queue; in the application logs you can see the output of the Spark job. This confirms that you can run Spark jobs in different queues; a sketch of such a submission follows below.
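For illustration, a submission to an explicit queue might look like the line below; the class name, jar, and queue name are placeholder values, not anything taken from this question.

spark-submit --master yarn --deploy-mode cluster --queue my_queue --class com.example.MyApp --num-executors 10 --executor-memory 4G my-app.jar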


1 Answer

You can set the queue name, the number of executors, the executor memory, the total number of cores, the cores per executor, the driver memory, etc. when you start spark-shell or spark-submit.

Here is how you can specify these parameters:

spark-shell --executor-memory 6G --executor-cores 5 --num-executors 20 --driver-memory 2G --queue $queue_name

You should calculate these parameters based on your cluster's capacity, following the fat executor vs. thin executor sizing approach; a rough sizing sketch is given below.
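As a hedged illustration of that sizing exercise (the cluster numbers below are hypothetical):

# Hypothetical cluster: 10 worker nodes, each with 16 cores and 64 GB RAM.
# Reserve ~1 core and ~1 GB per node for the OS and Hadoop daemons -> 15 usable cores, ~63 GB per node.
# At 5 cores per executor that gives 3 executors per node -> 30 in total; keep 1 slot for the YARN ApplicationMaster -> 29.
# Memory per executor: 63 GB / 3 = 21 GB, minus roughly 7-10% YARN overhead -> about 19 GB.
spark-shell --master yarn --queue $queue_name --num-executors 29 --executor-cores 5 --executor-memory 19G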

If you still want to check resource utilization, you can look at the ResourceManager page or the Spark web UI.
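If you prefer the command line, the standard YARN CLI also reports the queue and aggregate resources of a running application; <application_id> below is whatever ID the -list command prints for your session. From inside spark-shell, printing sc.getConf.get("spark.yarn.queue", "default") should likewise show the queue the shell was submitted to, assuming it is running on YARN.

yarn application -list -appStates RUNNING      # lists running applications together with their queue
yarn application -status <application_id>     # shows state, queue, and aggregate resource allocation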

answered Oct 05 '22 by Rohit Nimmala