How to set amount of Spark executors?

How can I configure the number of executors from Java (or Scala) code, given a SparkConf and a SparkContext? I constantly see 2 executors. It looks like spark.default.parallelism does not work here and is about something different.

I just need to set the number of executors to match my cluster size, but there are always only 2 of them. I know my cluster size. I run on YARN, if that matters.

Roman Nikitchenko — asked Oct 02 '14


People also ask

How do I set the number of executors in Spark?

Following the usual sizing recommendations, suppose a cluster of 10 nodes with 15 cores each: total available cores = 15 x 10 = 150. Number of available executors = (total cores / num-cores-per-executor) = 150 / 5 = 30. Leaving 1 executor for the ApplicationManager gives --num-executors = 29, i.e. executors per node = 30 / 10 = 3.
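The arithmetic above can be sketched in plain Java. The node count, cores per node, and cores per executor are the example's assumptions, not values Spark fixes for you:

```java
public class ExecutorSizing {
    // Executors available cluster-wide, reserving some capacity
    // (here one executor's worth) for the ApplicationManager.
    static int numExecutors(int nodes, int coresPerNode,
                            int coresPerExecutor, int reserved) {
        int totalCores = nodes * coresPerNode;              // 10 * 15 = 150
        int available  = totalCores / coresPerExecutor;     // 150 / 5 = 30
        return available - reserved;                        // 30 - 1 = 29
    }

    public static void main(String[] args) {
        // The example from above: 10 nodes, 15 cores each, 5 cores per executor.
        System.out.println(numExecutors(10, 15, 5, 1)); // prints 29
    }
}
```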

How many executors can you have in Spark?

The first Spark job starts with two executors (because the minimum number of nodes is set to two in this example). The cluster can autoscale to a maximum of ten executors (because the maximum number of nodes is set to ten).

How many executors can a worker node have in Spark?

If a node has enough memory, it can run two or more executors on the same machine.

What is the maximum executor memory in Spark?

As a rule of thumb, set executor memory between 8 GB and 16 GB. This is a somewhat arbitrary choice, governed by the two points above. Pack as many executors as can be assigned to one cluster node, and distribute cores evenly across all executors.


1 Answer

You can also do it programmatically by setting the parameters "spark.executor.instances" and "spark.executor.cores" on the SparkConf object.

Example:

SparkConf conf = new SparkConf()
        // 4 executors per instance of each worker
        .set("spark.executor.instances", "4")
        // 5 cores on each executor
        .set("spark.executor.cores", "5");

The second parameter, spark.executor.cores, applies only to YARN and standalone mode. It allows an application to run multiple executors on the same worker, provided that the worker has enough cores.
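That constraint can be sketched in plain Java with no Spark dependency. The 16-core worker below is a hypothetical example, not something taken from the question:

```java
public class ExecutorsPerWorker {
    // How many executors fit on one worker, given the worker's core count
    // and the spark.executor.cores setting.
    static int executorsPerWorker(int workerCores, int coresPerExecutor) {
        return workerCores / coresPerExecutor; // integer division: leftover cores stay idle
    }

    public static void main(String[] args) {
        // A hypothetical 16-core worker with spark.executor.cores = 5
        // can host 3 executors (one core is left unused).
        System.out.println(executorsPerWorker(16, 5)); // prints 3
    }
}
```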

A. One — answered Sep 19 '22