
How to set spark.local.dir property from spark shell?

I'm trying to set `spark.local.dir` from spark-shell using `sc.getConf.set("spark.local.dir", "/temp/spark")`, but it is not working. Is there any other way to set this property from spark-shell?

VSP asked Nov 02 '16

People also ask

How do I run spark-shell locally?

It's easy to run Spark locally on one machine — all you need is `java` installed on your system `PATH`, or the `JAVA_HOME` environment variable pointing to a Java installation. Spark runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+. Java 8 prior to version 8u201 is deprecated as of Spark 3.2.0.

Where are spark properties set?

Spark properties control most application parameters and can be set by using a SparkConf object, or through Java system properties. Environment variables can be used to set per-machine settings, such as the IP address, through the conf/spark-env.sh script on each node. Logging can be configured through log4j.
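To illustrate the `SparkConf` route, here is a minimal sketch (assuming Spark is on the classpath and no context is running yet; the app name and master are placeholders). Properties set this way take effect because they are read when the context is created:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Build the configuration before the context exists;
// spark.local.dir is the scratch space for shuffle/spill files.
val conf = new SparkConf()
  .setAppName("local-dir-demo")   // placeholder name
  .setMaster("local[*]")          // run locally on all cores
  .set("spark.local.dir", "/temp/spark")

val spark = SparkSession.builder.config(conf).getOrCreate()
```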


2 Answers

You can't do it from inside the shell: the Spark context has already been created, so the local dir has already been set (and used). You should pass it as a parameter when starting the shell:

./spark-shell --conf spark.local.dir=/temp/spark
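Once the shell is up, you can confirm the setting took effect (a quick check, assuming a standard spark-shell session where `sc` is predefined):

```scala
// Inside spark-shell: sc is the predefined SparkContext
sc.getConf.get("spark.local.dir")
// expected to contain the path passed via --conf
```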
Tzach Zohar answered Oct 02 '22

@Tzach Zohar's solution seems to be the right answer.

However, if you insist on setting spark.local.dir from spark-shell, you can do it:

1) stop the current Spark context

    sc.stop()

2) update the sc configuration and restart it.

The updated code was kindly provided by @Tzach-Zohar:

SparkSession.builder.config(sc.getConf).config("spark.local.dir", "/temp/spark").getOrCreate()

@Tzach Zohar notes: "but you get a WARN SparkContext: Use an existing SparkContext, some configuration may not take effect", which suggests this isn't the recommended path to take.
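Putting the two steps together, a minimal sketch of the stop-and-recreate approach (assuming a spark-shell session where `sc` is predefined; as noted above, the WARN means some settings may not take effect):

```scala
import org.apache.spark.sql.SparkSession

// 1) Stop the running context so its configuration can be replaced
sc.stop()

// 2) Rebuild the session from the old conf plus the new local dir
val spark = SparkSession.builder
  .config(sc.getConf)                     // carry over the existing settings
  .config("spark.local.dir", "/temp/spark") // override the scratch directory
  .getOrCreate()

val sc2 = spark.sparkContext // the new context to use from here on
```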

Yaron answered Oct 02 '22