
Default number of executors and cores for spark-shell

Tags:

apache-spark

If I run a Spark program in spark-shell, is it possible that it will hog the entire Hadoop cluster for hours?

Usually there are settings called num-executors and executor-cores, for example:

spark-shell --driver-memory 10G --executor-memory 15G --executor-cores 8

But if they are not specified and I just run spark-shell, will it consume the entire cluster, or are there reasonable defaults?

asked May 10 '16 by Knows Not Much


People also ask

What is the default number of executors in spark?

spark.executor.instances controls the number of executors to be used; the corresponding spark-submit option is --num-executors. If it is not set, the default is 2.

How many cores does executor Spark have?

The consensus in most Spark tuning guides is that 5 cores per executor is the optimal number for parallel processing.

How do you determine the number of executors and cores in spark?

Following the 5-cores-per-executor recommendation above, and assuming, for example, a cluster of 10 nodes with 16 cores each: leave 1 core per node for the Hadoop/YARN daemons, so cores available per node = 16 - 1 = 15. Total cores available in the cluster = 15 × 10 = 150. Number of executors = total cores / cores per executor = 150 / 5 = 30.
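
Translated into a launch command (on YARN), that sizing might look like the sketch below; the executor memory value is illustrative, since memory was not part of the calculation above:

spark-shell --num-executors 30 --executor-cores 5 --executor-memory 10G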

How many executors does a node Spark have?

In the Spark architecture, the central coordinator is called the Spark Driver, and it communicates with all the Workers. Each Worker node hosts one or more Executors, which are responsible for running Tasks.


1 Answer

The default values for most configuration properties can be found in the Spark Configuration documentation. For the configuration properties in your example, the defaults are:

  • spark.driver.memory = 1g
  • spark.executor.memory = 1g
  • spark.executor.cores = 1 in YARN mode, all the available cores on the worker in standalone mode.
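
To check what a running shell actually picked up, you can read the values back from the SparkConf inside the REPL. A minimal sketch (sc is the SparkContext that spark-shell creates for you):

// returns the configured value, or the supplied fallback if the property is unset
sc.getConf.get("spark.executor.memory", "1g")
sc.getConf.get("spark.executor.cores", "1")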

Additionally, you can override these defaults by creating the file $SPARK_HOME/conf/spark-defaults.conf with the properties you want (as described in the Spark Configuration documentation). Then, if the file exists with the desired values, you don't need to pass them as arguments to the spark-shell command.
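
For example, a minimal spark-defaults.conf reproducing the flags from the question might look like this (the values are the question's own, not recommendations):

spark.driver.memory    10g
spark.executor.memory  15g
spark.executor.cores   8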

answered Oct 14 '22 by Daniel de Paula