For testing purposes I want to run Spark 2.x in local mode. Can I do this, and if so, how? Currently I write the following in my main:
val spark = SparkSession
  .builder
  .appName("RandomForestClassifierExample")
  .getOrCreate()
and run the main in IntelliJ, but I get the error
org.apache.spark.SparkException: A master URL must be set in your configuration
I guess I need to have some local instance running or set a local mode or something like that. What should I do exactly?
So, how do you run Spark in local mode? It is very simple. When you do not pass a --master flag to the interactive shells spark-shell or pyspark, they start in local mode by default. Alternatively, you can pass --master local explicitly (to spark-shell, pyspark, or spark-submit), which runs Spark locally with a single thread.
To create a SparkSession in Scala or Python, use the builder pattern: call builder() and then getOrCreate(). If a SparkSession already exists, getOrCreate() returns it; otherwise it creates a new one. SparkSession.builder() returns a Builder object used to configure and construct the session.
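As a minimal sketch (the object name and the local[2] master setting here are just illustrative), calling getOrCreate() a second time returns the session created earlier instead of building a new one:

import org.apache.spark.sql.SparkSession

object GetOrCreateDemo {
  def main(args: Array[String]): Unit = {
    val first = SparkSession.builder
      .master("local[2]")            // run locally with 2 threads
      .appName("GetOrCreateDemo")
      .getOrCreate()

    // A second getOrCreate() returns the already-running session.
    val second = SparkSession.builder.getOrCreate()
    println(first eq second)         // true: same underlying SparkSession

    first.stop()
  }
}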
You should configure .master(...) before calling getOrCreate():
val spark = SparkSession.builder
  .master("local")
  .appName("RandomForestClassifierExample")
  .getOrCreate()
"local" means all of Spark's components (master, executors) will run locally within your single JVM running this code (very convenient for tests, pretty much irrelevant for real world scenarios). Read more about other "master" options here.