I'm trying to set spark.local.dir from spark-shell using sc.getConf.set("spark.local.dir", "/temp/spark"), but it is not working. Is there any other way to set this property from spark-shell?
It's easy to run locally on one machine — all you need is to have Java installed on your system PATH, or the JAVA_HOME environment variable pointing to a Java installation. Spark runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+. Support for Java 8 versions prior to 8u201 is deprecated as of Spark 3.2.0.
Spark properties control most application parameters and can be set by using a SparkConf object, or through Java system properties. Environment variables can be used to set per-machine settings, such as the IP address, through the conf/spark-env.sh script on each node. Logging can be configured through log4j.
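For illustration, here is a minimal sketch of setting spark.local.dir through a SparkConf before the context is created, as the docs describe. This applies to a standalone application (e.g. one launched with spark-submit), not inside spark-shell where a context already exists; the app name and the /temp/spark path are just placeholders taken from the question.

import org.apache.spark.{SparkConf, SparkContext}

// Properties must be set before the SparkContext is created.
val conf = new SparkConf()
  .setAppName("local-dir-demo")           // placeholder app name
  .setMaster("local[*]")                  // assumption: running locally
  .set("spark.local.dir", "/temp/spark")  // scratch directory from the question

val sc = new SparkContext(conf)
println(sc.getConf.get("spark.local.dir")) // prints /temp/spark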
You can't do it from inside the shell: the Spark context has already been created, so the local dir has already been set (and used). You should pass it as a parameter when starting the shell:
./spark-shell --conf spark.local.dir=/temp/spark
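Once the shell has started with that flag, you can read the value back to confirm it took effect:

// Inside spark-shell started with --conf spark.local.dir=/temp/spark
sc.getConf.get("spark.local.dir")       // returns /temp/spark
sc.getConf.getOption("spark.local.dir") // Option variant, avoids an exception if unset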
@Tzach Zohar's solution seems to be the right answer.
However, if you insist on setting spark.local.dir from spark-shell, you can do it:
1) Stop the current Spark context:
sc.stop()
2) Update the configuration and restart the context. The updated code was kindly provided by @Tzach-Zohar:
SparkSession.builder.config(sc.getConf).config("spark.local.dir", "/temp/spark").getOrCreate()
@Tzach Zohar's note: "but you get a WARN SparkContext: Use an existing SparkContext, some configuration may not take effect", which suggests this isn't the recommended path to take.
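Putting the two steps together, this is roughly how the restart sequence looks in spark-shell (Spark 2.x or later, where SparkSession is available); newSpark and newSc are just illustrative names, and the WARN mentioned above may still appear:

import org.apache.spark.sql.SparkSession

// 1) Stop the context that spark-shell created at startup.
sc.stop()

// 2) Rebuild the session, reusing the old configuration plus the new local dir.
val newSpark = SparkSession.builder
  .config(sc.getConf)
  .config("spark.local.dir", "/temp/spark")
  .getOrCreate()

// The replacement context is available through the session.
val newSc = newSpark.sparkContext
newSc.getConf.get("spark.local.dir")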