Number of Executors in Spark Local Mode

So I am running a Spark job in local mode. I use the following command to run the job:

spark-submit --master local[*] --driver-memory 256g --class main.scala.mainClass target/scala-2.10/spark_proj-assembly-1.0.jar 0 large.csv 100 outputFolder2 10

I am running this on a machine with 32 cores and 256 GB RAM. When creating the conf, I use the following code:

val conf = new SparkConf().setMaster("local[*]").setAppName("My App")
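
For reference, a minimal, self-contained sketch of how the master string controls the number of worker threads in local mode (the object name LocalModeDemo is just for illustration):

import org.apache.spark.{SparkConf, SparkContext}

object LocalModeDemo {
  def main(args: Array[String]): Unit = {
    // local      -> 1 worker thread
    // local[4]   -> 4 worker threads
    // local[*]   -> one worker thread per logical core
    val conf = new SparkConf().setMaster("local[*]").setAppName("My App")
    val sc = new SparkContext(conf)
    println(s"defaultParallelism = ${sc.defaultParallelism}") // 32 on a 32-core machine
    sc.stop()
  }
}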

Now I know that in local mode Spark runs everything inside a single JVM, but does that mean it launches only one driver and uses it as the executor as well? In my timeline it shows one executor "driver" added, and when I go to the Executors page, there is just one executor with 32 cores assigned to it.

[Screenshot: One Executor Added in Timeline]

[Screenshot: One Executor with 32 Cores]

Is this the default behavior? I was expecting Spark to launch one executor per core instead of just one executor that gets all the cores. If someone can explain this behavior, that would be great.

asked Jun 16 '17 by Hassan Jalil

People also ask

How many executors are in Spark local mode?

Local mode is by definition a "pseudo-cluster" that runs in a single JVM. That means the maximum number of executors is 1.

How many executors does Spark have?

One executor is created on each node allocated with Slurm when using Spark in standalone mode (so five executors would be created in the above example).

How do you determine the number of executors in a Spark?

Number of available executors = (total cores/num-cores-per-executor) = 150/5 = 30.
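
A one-line sketch of that arithmetic in Scala (the figures are only the example's, not from any particular cluster):

val totalCores = 150        // example: cluster-wide core count
val coresPerExecutor = 5    // example: cores assigned to each executor
val numExecutors = totalCores / coresPerExecutor // 150 / 5 = 30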

What is the default number of executors in Spark?

spark.executor.instances: the maximum number of executors to be used. Its spark-submit option is --max-executors. If it is not set, the default is 2.


1 Answer

Is this the default behavior?

In local mode, your driver + executors are, as you've said, created inside a single JVM process. What you see isn't an executor; it is a view of how many cores your job has at its disposal. Usually when running under local mode, you should only be seeing the driver in the executors view.
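
A quick way to confirm this is to list what the SparkContext reports as executors; in local mode the only entry is the driver itself. A minimal sketch, assuming a live SparkContext named sc:

// In local mode this map has exactly one entry: the driver acting as the executor
sc.getExecutorMemoryStatus.keys.foreach(println) // prints the driver's host:port
println(sc.defaultParallelism) // the core count requested via local[*], e.g. 32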

If you look at the code for LocalSchedulerBackend, you'll see the following comment:

/**
 * Used when running a local version of Spark where the executor, backend, and master all run in
 * the same JVM. It sits behind a [[TaskSchedulerImpl]] and handles launching tasks on a single
 * Executor (created by the [[LocalSchedulerBackend]]) running locally.
 */

We have a single executor, in the same JVM instance, which handles all tasks.
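
Note that a single executor does not imply sequential execution: tasks from different partitions still run in parallel on that executor's thread pool. A small sketch, assuming the 32-core setup above (the partition count is arbitrary):

// 64 partitions -> 64 tasks, all scheduled onto the single in-JVM executor,
// with up to 32 running concurrently under local[*] on a 32-core machine
val rdd = sc.parallelize(1 to 1000, numSlices = 64)
rdd.map(_ * 2).count()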

answered Oct 14 '22 by Yuval Itzchakov