NOTE: The author is looking for ways to set the Spark master when running the Spark examples that involve no changes to the source code, only options that can be supplied from the command line, if at all possible.
Let us consider the run() method of the BinaryClassification example:
def run(params: Params) {
  val conf = new SparkConf().setAppName(s"BinaryClassification with $params")
  val sc = new SparkContext(conf)
Notice that the SparkConf does not provide any means to configure the Spark master.
When running this program from IntelliJ with the following arguments:
--algorithm LR --regType L2 --regParam 1.0 data/mllib/sample_binary_classification_data.txt
the following error occurs:
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:166)
    at org.apache.spark.examples.mllib.BinaryClassification$.run(BinaryClassification.scala:105)
I have also tried adding the Spark master URL anyway (though the code does not appear to support it):
spark://10.213.39.125:17088 --algorithm LR --regType L2 --regParam 1.0 data/mllib/sample_binary_classification_data.txt
and
--algorithm LR --regType L2 --regParam 1.0 spark://10.213.39.125:17088 data/mllib/sample_binary_classification_data.txt
Neither works; both fail with the error:
Error: Unknown argument 'data/mllib/sample_binary_classification_data.txt'
For reference, here is the option parsing, which does nothing with the Spark master:
val parser = new OptionParser[Params]("BinaryClassification") {
  head("BinaryClassification: an example app for binary classification.")
  opt[Int]("numIterations")
    .text("number of iterations")
    .action((x, c) => c.copy(numIterations = x))
  opt[Double]("stepSize")
    .text(s"initial step size, default: ${defaultParams.stepSize}")
    .action((x, c) => c.copy(stepSize = x))
  opt[String]("algorithm")
    .text(s"algorithm (${Algorithm.values.mkString(",")}), " +
      s"default: ${defaultParams.algorithm}")
    .action((x, c) => c.copy(algorithm = Algorithm.withName(x)))
  opt[String]("regType")
    .text(s"regularization type (${RegType.values.mkString(",")}), " +
      s"default: ${defaultParams.regType}")
    .action((x, c) => c.copy(regType = RegType.withName(x)))
  opt[Double]("regParam")
    .text(s"regularization parameter, default: ${defaultParams.regParam}")
  arg[String]("<input>")
    .required()
    .text("input paths to labeled examples in LIBSVM format")
    .action((x, c) => c.copy(input = x))
}
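This also explains the "Unknown argument" error above: the parser defines exactly one positional slot, <input>, so the master URL is consumed as <input> and the actual data path has nowhere to go. A standalone sketch (assuming scopt 3.x on the classpath; MiniParams and ParseDemo are my own names, trimmed down to the relevant part) reproducing the failure:

import scopt.OptionParser

case class MiniParams(input: String = "")

object ParseDemo {
  def main(args: Array[String]): Unit = {
    val parser = new OptionParser[MiniParams]("ParseDemo") {
      arg[String]("<input>")
        .required()
        .action((x, c) => c.copy(input = x))
    }
    // The URL fills the single <input> slot, so parsing fails with:
    // Error: Unknown argument 'data/mllib/sample_binary_classification_data.txt'
    parser.parse(Seq(
      "spark://10.213.39.125:17088",
      "data/mllib/sample_binary_classification_data.txt"), MiniParams()) match {
      case Some(p) => println(s"parsed input: ${p.input}")
      case None    => // scopt has already printed the error to stderr
    }
  }
}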
So, yes, I could go ahead and modify the source code. But I suspect I am instead missing an available tuning knob that would make this work without modifying the source.
You can set the Spark master from the command line by adding the JVM parameter:
-Dspark.master=spark://myhost:7077
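This works because the no-argument SparkConf constructor loads every JVM system property that starts with spark., including spark.master, so no source change is needed. A minimal sketch to verify it (the object name and printed message are mine, for illustration):

import org.apache.spark.{SparkConf, SparkContext}

// Launch with e.g.: -Dspark.master=local[*]
object MasterFromSystemProperty {
  def main(args: Array[String]): Unit = {
    // No setMaster() call: spark.master comes from the JVM property.
    val conf = new SparkConf().setAppName("master-from-property demo")
    val sc = new SparkContext(conf)
    println(s"Running against master: ${sc.master}")
    sc.stop()
  }
}

In IntelliJ, put the -D flag in the run configuration's "VM options" field, not in "Program arguments".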
If you want to do this from code, you can use .setMaster(...) when creating the SparkConf:
val conf = new SparkConf()
  .setAppName("Simple Application")
  .setMaster("spark://myhost:7077")
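If you only want a hard-coded master as a fallback, you can guard the call so that anything supplied externally (via -Dspark.master or spark-submit --master) still wins. A sketch; the local[*] fallback value is illustrative:

import org.apache.spark.SparkConf

val base = new SparkConf().setAppName("Simple Application")
// Only set a master when none was provided from the outside.
val conf =
  if (base.contains("spark.master")) base
  else base.setMaster("local[*]")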
Long-overdue EDIT (as per the comments):
For a SparkSession in Spark 2.x+:
val spark = SparkSession.builder()
  .appName("app_name")
  .getOrCreate()
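The builder also accepts a master directly via .master(...); leaving it out, as above, defers the choice to spark-submit or the spark.master property. A sketch (the local[*] value is mine, for illustration):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("app_name")
  .master("local[*]") // omit this line to configure the master externally
  .getOrCreate()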
Command line (2.x), assuming a local standalone cluster:
spark-shell --master spark://localhost:7077
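The same --master flag works with spark-submit, which is the usual way to launch the bundled examples. A sketch only: the examples jar path below is an assumption and depends on your Spark version and build layout:

spark-submit --master spark://localhost:7077 \
  --class org.apache.spark.examples.mllib.BinaryClassification \
  examples/jars/spark-examples_2.11-2.4.0.jar \
  --algorithm LR --regType L2 --regParam 1.0 \
  data/mllib/sample_binary_classification_data.txt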