Is it possible to set multiple executors for a Spark Streaming application in local mode using Spark conf settings? So far, I cannot see any change in the Spark UI, either in performance or in the number of executors, when I set the spark.executor.instances parameter to 4, for example.
SparkConf conf = new SparkConf()
    // 4 executors per instance of each worker
    .set("spark.executor.instances", "4")
    // 5 cores on each executor
    ...
Number of available executors = (total cores/num-cores-per-executor) = 150/5 = 30.
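The arithmetic above can be written out as a small worked example. This is only a sketch: the class and method names (`ExecutorMath`, `availableExecutors`) are made up for illustration and are not part of Spark, and it assumes every executor is given the same number of cores.

```java
// Hypothetical helper for illustration; not part of Spark's API.
final class ExecutorMath {

    /** Number of executors that fit when each executor gets the same core count. */
    static int availableExecutors(int totalCores, int coresPerExecutor) {
        if (coresPerExecutor <= 0) {
            throw new IllegalArgumentException("coresPerExecutor must be positive");
        }
        // Integer division: cores that do not fill a whole executor are left unused.
        return totalCores / coresPerExecutor;
    }

    public static void main(String[] args) {
        // Figures from the text: 150 cores in total, 5 cores per executor.
        System.out.println(availableExecutors(150, 5)); // prints 30
    }
}
```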
The number of executors for a Spark application can be specified inside the SparkConf or via the --num-executors flag on the command line. Cluster manager: an external service for acquiring resources on the cluster (e.g. standalone manager, Mesos, YARN).
The first Spark job starts with two executors (because the minimum number of nodes is set to two in this example). The cluster can autoscale to a maximum of ten executors (because the maximum number of nodes is set to ten).
Local mode is a development tool in which all components are simulated on a single machine. Since a single JVM means a single executor, changing the number of executors is simply not possible, and spark.executor.instances is not applicable. All you can do in local mode is increase the number of threads by modifying the master URL: local[n], where n is the number of threads.
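To make the meaning of the master URL concrete, here is a minimal sketch of how a local master string maps to a thread count. The class `LocalMaster` and its method `threadsFor` are hypothetical names used only for illustration; this is not Spark's own parsing code. Besides local[n], it assumes the two other common forms: plain local (one thread) and local[*] (one thread per logical core).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Spark's API.
final class LocalMaster {

    // Matches "local[4]" or "local[*]" and captures what is inside the brackets.
    private static final Pattern LOCAL_N = Pattern.compile("local\\[(\\d+|\\*)\\]");

    /** Returns the number of worker threads a local master URL asks for. */
    static int threadsFor(String master) {
        if (master.equals("local")) {
            return 1; // plain "local" runs everything on a single thread
        }
        Matcher m = LOCAL_N.matcher(master);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a local master URL: " + master);
        }
        String n = m.group(1);
        // "local[*]" means one thread per logical core on this machine.
        return n.equals("*")
                ? Runtime.getRuntime().availableProcessors()
                : Integer.parseInt(n);
    }

    public static void main(String[] args) {
        System.out.println(threadsFor("local[4]")); // prints 4
    }
}
```

In an actual application the string is simply passed to Spark, for example `new SparkConf().setMaster("local[4]")`, or `--master local[4]` with spark-submit. Whatever value is chosen, there is still one executor; only the number of task threads inside it changes.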
Local mode is by definition a "pseudo-cluster" that runs in a single JVM. That means the maximum number of executors is 1.
If you want to experiment with multiple executors on a local machine, you can create a cluster with a couple of workers running on that machine. The number of running worker instances is the maximum number of executors for your tasks.
spark.executor.instances is not honoured in local mode.
Reference - https://books.japila.pl/apache-spark-internals/local/?h=local
Local-Mode: In this non-distributed single-JVM deployment mode, Spark spawns all the execution components - driver, executor, LocalSchedulerBackend, and master - in the same single JVM. The default parallelism is the number of threads as specified in the master URL. This is the only mode where a driver is used for execution.
So you can increase the number of threads in the JVM to n by passing the master URL as local[n].