
How to change the number of executors in local mode?

Is it possible to set multiple executors for a Spark Streaming application in local mode using some SparkConf setting? So far I see no change in the Spark UI, either in performance or in the number of executors, when I set the spark.executor.instances parameter to 4, for example. The configuration I am using looks roughly like the sketch below.
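For reference, a minimal sketch of this kind of setup (the app name and thread count are placeholders):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    SparkConf conf = new SparkConf()
        .setAppName("streaming-app")            // placeholder name
        .setMaster("local[2]")                  // local mode
        .set("spark.executor.instances", "4");  // appears to have no effect
    JavaSparkContext sc = new JavaSparkContext(conf);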

Cassie asked Sep 05 '18


People also ask

How do I change the number of executors in Spark?

    SparkConf conf = new SparkConf()
        // 4 executors per instance of each worker
        .set("spark.executor.instances", "4")
        // 5 cores on each executor
        .set("spark.executor.cores", "5");

How do you decide the number of executors in a Spark job?

Given, say, 150 cores in total and 5 cores per executor: number of available executors = (total cores / num-cores-per-executor) = 150/5 = 30.

How would you set the number of executors, say 5, for a Spark application?

The number of executors for a Spark application can be specified inside the SparkConf or via the --num-executors flag on the command line, as sketched below. Cluster manager: an external service for acquiring resources on the cluster (e.g. standalone manager, Mesos, YARN).
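A minimal sketch of both options (assuming a cluster deployment such as YARN or standalone, where the setting is honoured; the app and class names are placeholders):

    import org.apache.spark.SparkConf;

    // Option 1: inside the SparkConf
    SparkConf conf = new SparkConf()
        .setAppName("my-app")                    // placeholder name
        .set("spark.executor.instances", "5");   // request 5 executors

    // Option 2: on the command line when submitting
    //   spark-submit --num-executors 5 --class my.Main my-app.jar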

How many executors can you have in Spark?

The first Spark job starts with two executors (because the minimum number of nodes is set to two in this example). The cluster can autoscale to a maximum of ten executors (because the maximum number of nodes is set to ten).


3 Answers

Local mode is a development tool in which all components are simulated on a single machine. Since a single JVM means a single executor, changing the number of executors is simply not possible, and spark.executor.instances is not applicable.

All you can do in local mode is increase the number of threads by modifying the master URL - local[n], where n is the number of threads. For example:
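A minimal sketch (the app name is a placeholder):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    SparkConf conf = new SparkConf()
        .setAppName("local-threads-demo")  // placeholder name
        .setMaster("local[4]");            // 4 threads; local[*] uses one per core
    JavaSparkContext sc = new JavaSparkContext(conf);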

user10321164 answered Sep 28 '22


Local mode is by definition a "pseudo-cluster" that runs in a single JVM. That means the maximum number of executors is 1.

If you want to experiment with multiple executors on your local machine, you can instead create a standalone cluster with a couple of workers running locally, as sketched below. The number of running worker instances is then the maximum number of executors for your tasks.
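A sketch of that setup, assuming a standalone master and workers started locally with Spark's launch scripts (sbin/start-master.sh and sbin/start-worker.sh, called start-slave.sh in older releases; the default master port is 7077):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    // First start the cluster, e.g.:
    //   sbin/start-master.sh
    //   sbin/start-worker.sh spark://localhost:7077   (once per worker)
    SparkConf conf = new SparkConf()
        .setAppName("local-cluster-demo")        // placeholder name
        .setMaster("spark://localhost:7077")     // standalone master URL
        .set("spark.executor.instances", "2");   // honoured here, unlike local mode
    JavaSparkContext sc = new JavaSparkContext(conf);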

vvg answered Sep 28 '22


spark.executor.instances is not honoured in local mode.

Reference - https://books.japila.pl/apache-spark-internals/local/?h=local

Local-Mode: In this non-distributed single-JVM deployment mode, Spark spawns all the execution components - driver, executor, LocalSchedulerBackend, and master - in the same single JVM. The default parallelism is the number of threads as specified in the master URL. This is the only mode where a driver is used for execution.

So you can increase the number of threads in the JVM to n by passing the master URL as local[n]. For example:
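A quick way to check the effect (a sketch; the app name is a placeholder):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    SparkConf conf = new SparkConf()
        .setAppName("parallelism-check")  // placeholder name
        .setMaster("local[4]");           // 4 threads
    JavaSparkContext sc = new JavaSparkContext(conf);
    // Default parallelism follows the thread count in local[n], so this prints 4.
    System.out.println(sc.defaultParallelism());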

moriarty007 answered Sep 28 '22