 

What does setMaster `local[*]` mean in spark?

I found some code to start spark locally with:

val conf = new SparkConf().setAppName("test").setMaster("local[*]")
val ctx = new SparkContext(conf)

What does the [*] mean?

asked Sep 02 '15 by Freewind

People also ask

What is local * In Spark?

local[*] runs Spark locally with as many worker threads as there are logical cores on your machine. The code you have given sets a master URL that makes Spark run locally, using all of the threads (logical cores) on your machine.
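A quick way to see this in practice (a minimal sketch; the app name "cores-check" is just a placeholder): with local[*], Spark's default parallelism should match the number of logical cores the JVM reports.

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("cores-check").setMaster("local[*]")
val sc = new SparkContext(conf)

// Both numbers should normally agree in local[*] mode
println(s"Logical cores (JVM): ${Runtime.getRuntime.availableProcessors()}")
println(s"Spark default parallelism: ${sc.defaultParallelism}")

sc.stop()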

How do I run a local script in Spark?

Spark provides the spark-submit command to execute an application file, whether it is written in Scala or Java (packaged as a jar), Python, or R. The command is: $ spark-submit --master <url> <SCRIPTNAME>.py
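For Scala, the application is packaged as a jar first. A rough sketch of such an application (the object name SimpleApp and the jar path are placeholders, not something from the original post):

// Build into a jar, then launch with e.g.:
//   $ spark-submit --master local[*] --class SimpleApp target/simple-app.jar
import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // No setMaster here: the master URL comes from --master on the command line
    val conf = new SparkConf().setAppName("SimpleApp")
    val sc = new SparkContext(conf)
    val evens = sc.parallelize(1 to 1000).filter(_ % 2 == 0).count()
    println(s"Even numbers: $evens")
    sc.stop()
  }
}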

What is master URL in Spark?

Check the master's web UI, e.g. http://master:8080 for a standalone cluster, where master points to the Spark master machine. There you will see the Spark master URI, which is spark://master:7077 by default; quite a bit of other information lives there as well if you have a Spark standalone cluster.

How do I run a local Spark cluster?

To run an application on a Spark standalone cluster, simply pass the spark://IP:PORT URL of the master to the SparkContext constructor. You can also pass the option --total-executor-cores <numCores> to control the number of cores that spark-shell uses on the cluster.
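Programmatically, that looks roughly like the sketch below (the host name and port are placeholders; spark.cores.max is the configuration counterpart of the --total-executor-cores flag):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("cluster-app")
  .setMaster("spark://master-host:7077")   // your standalone master URL
  .set("spark.cores.max", "4")             // cap total executor cores on the cluster
val sc = new SparkContext(conf)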


2 Answers

From the doc:

./bin/spark-shell --master local[2] 

The --master option specifies the master URL for a distributed cluster, or local to run locally with one thread, or local[N] to run locally with N threads. You should start by using local for testing.

And from here:

local[*] : Run Spark locally with as many worker threads as logical cores on your machine.
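To make the "worker threads" wording concrete, here is a small sketch (assuming local[2]; exact thread names may vary by Spark version) that prints the thread each partition is processed on:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("threads-demo").setMaster("local[2]")
val sc = new SparkContext(conf)

// With local[2], tasks run inside (at most) two task threads of the driver JVM
sc.parallelize(1 to 8, numSlices = 4)
  .foreachPartition(_ => println(s"running on ${Thread.currentThread().getName}"))

sc.stop()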

answered by ccheneson

The supported master URL formats and their meanings (a short usage sketch follows the list):


local : Run Spark locally with one worker thread (i.e. no parallelism at all).


local[K] : Run Spark locally with K worker threads (ideally, set this to the number of cores on your machine).


local[K,F] : Run Spark locally with K worker threads and F maxFailures (see spark.task.maxFailures for an explanation of this variable).


local[*] : Run Spark locally with as many worker threads as logical cores on your machine.


local[*,F] : Run Spark locally with as many worker threads as logical cores on your machine and F maxFailures.


spark://HOST:PORT : Connect to the given Spark standalone cluster master. The port must be whichever one your master is configured to use, which is 7077 by default.


spark://HOST1:PORT1,HOST2:PORT2 : Connect to the given Spark standalone cluster with standby masters using ZooKeeper. The list must include all the master hosts in the high-availability cluster set up with ZooKeeper. The port must be whichever each master is configured to use, which is 7077 by default.


mesos://HOST:PORT : Connect to the given Mesos cluster. The port must be whichever you have configured to use, which is 5050 by default. Or, for a Mesos cluster using ZooKeeper, use mesos://zk://.... To submit with --deploy-mode cluster, the HOST:PORT should be configured to connect to the MesosClusterDispatcher.


yarn : Connect to a YARN cluster in client or cluster mode depending on the value of --deploy-mode. The cluster location will be found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable.

https://spark.apache.org/docs/latest/submitting-applications.html
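As a rough sketch of how these URL forms are passed programmatically (every host name and port below is a placeholder; in practice the master is usually supplied via spark-submit --master rather than hard-coded):

import org.apache.spark.{SparkConf, SparkContext}

val masterUrl = "local[4,2]"             // 4 worker threads, spark.task.maxFailures = 2
// val masterUrl = "local[*]"            // all logical cores
// val masterUrl = "spark://host:7077"   // standalone cluster
// val masterUrl = "mesos://host:5050"   // Mesos cluster
// val masterUrl = "yarn"                // YARN; needs HADOOP_CONF_DIR or YARN_CONF_DIR

val conf = new SparkConf().setAppName("master-url-demo").setMaster(masterUrl)
val sc = new SparkContext(conf)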

answered by FreeMan