
When specifying local[n1,n2,n3] for spark master, what are the three parameters?

Tags:

apache-spark

When launching Spark, I have seen:

--master local[n1,n2,n3] 

where n1, n2, and n3 are integers.

What do these refer to?

WestCoastProjects asked May 01 '15 22:05





2 Answers

The master specification is parsed in SparkContext.createTaskScheduler. (See the link for the implementation.) The possibilities with local are:

  • local uses 1 thread.
  • local[N] uses N threads.
  • local[*] uses as many threads as there are cores.
  • local[N, M] and local[*, M] behave like the two forms above, but set the maximum number of task failures to M. This lets you enable retries when running locally. (Normally local retries are disabled. Enabling them is useful for testing.)
  • local-cluster[numSlaves, coresPerSlave, memoryPerSlave] starts executors in separate processes as configured, but it does not require running workers and masters. It's a lightweight way to simulate a cluster in unit tests. (See also SPARK-595.)
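The forms listed above can be sketched as a small classifier. This is only an illustration written for this answer, not Spark's own code: the regular expressions, the describe_master function, and the dictionary keys are all hypothetical names, and the patterns are modeled on the list rather than copied from SparkContext. It shows that a three-integer local[n1,n2,n3] matches none of the listed forms, while local-cluster[...] is the one that takes three values.

```python
import os
import re

# Hypothetical patterns modeled on the forms in the list above (not Spark's source).
LOCAL_N = re.compile(r"local\[(\d+|\*)\]")
LOCAL_N_FAILURES = re.compile(r"local\[(\d+|\*)\s*,\s*(\d+)\]")
LOCAL_CLUSTER = re.compile(r"local-cluster\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]")


def _threads(spec, cores):
    # "*" means one thread per available core; otherwise use the given number.
    return cores if spec == "*" else int(spec)


def describe_master(master, cores=None):
    """Classify a local master string; raise ValueError for unknown forms."""
    cores = cores or os.cpu_count() or 1
    if master == "local":
        # max_failures=None stands for "not set", i.e. retries stay disabled.
        return {"mode": "local", "threads": 1, "max_failures": None}
    m = LOCAL_N.fullmatch(master)
    if m:
        return {"mode": "local", "threads": _threads(m.group(1), cores),
                "max_failures": None}
    m = LOCAL_N_FAILURES.fullmatch(master)
    if m:
        return {"mode": "local", "threads": _threads(m.group(1), cores),
                "max_failures": int(m.group(2))}
    m = LOCAL_CLUSTER.fullmatch(master)
    if m:
        return {"mode": "local-cluster", "num_slaves": int(m.group(1)),
                "cores_per_slave": int(m.group(2)),
                "memory_per_slave": int(m.group(3))}
    raise ValueError(f"unrecognized local master: {master!r}")


if __name__ == "__main__":
    for spec in ["local", "local[4]", "local[*, 3]", "local-cluster[2, 1, 1024]"]:
        print(spec, "->", describe_master(spec, cores=8))
```

In this sketch, describe_master("local[1,2,3]") raises ValueError, which matches the point of the list: three comma-separated values only make sense with the local-cluster prefix.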
Daniel Darabos answered Oct 07 '22 05:10



Only these parameters are supported for the "local" master mode:

  • local: one thread
  • local[n]: n threads
  • local[*]: as many threads as your CPUs have cores

See https://spark.apache.org/docs/latest/submitting-applications.html#master-urls

Regards,

Olivier.

Olivier Girardot answered Oct 07 '22 06:10
