spark.conf.set("spark.driver.maxResultSize", '6g') is not updating the default value - PySpark

I am trying to update the spark.driver.maxResultSize value to 6g, but the value is not getting updated.

spark.conf.set("spark.driver.maxResultSize", '6g')

Note: I am running this command in an Azure Databricks notebook.
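For what it's worth, you can see the value the driver is actually using, independent of what spark.conf.set reports, by reading it from the SparkContext's conf (a quick check, assuming the notebook's default spark session):

# Read the conf of the already-running driver; spark.conf.set does not
# change this, because the driver JVM was started with the old value.
print(spark.sparkContext.getConf().get("spark.driver.maxResultSize", "<unset>"))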

asked Sep 16 '25 by Kiruparan Balachandran

1 Answer

In Spark 2.0+ you can use the SparkSession.conf.set method to change configuration options at runtime, but it is mostly limited to SQL configuration. spark.driver.maxResultSize is a driver property, and driver properties are read only when the driver starts, so setting it on an already-running session has no effect. Instead, you need to stop the running context and create a new one with the updated conf. For example:

import pyspark

# Grab the context behind the notebook's existing SparkSession
sc = spark.sparkContext

# Build a new conf carrying the updated driver setting
conf = pyspark.SparkConf().setAll([("spark.driver.maxResultSize", "6g")])

# Stop the old context, which still holds the old conf ...
sc.stop()

# ... and start a fresh one that picks up the new conf
sc = pyspark.SparkContext(conf=conf)

Alternatively, you can just getOrCreate a new session with a predefined config, e.g. loaded from a YAML file or set in code (see the sketch below). You can then verify the new conf yourself with:

sc.getConf().getAll()
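As a minimal sketch of that builder route (the app name is just a placeholder; getOrCreate only applies the config when it actually creates a session, so the running one must be stopped first):

from pyspark.sql import SparkSession

# Stop the running session; otherwise getOrCreate returns the existing
# one and the driver setting never reaches a new driver.
spark.stop()

# Build a fresh session whose driver starts with the new limit
spark = (SparkSession.builder
         .appName("max-result-size-demo")  # placeholder name
         .config("spark.driver.maxResultSize", "6g")
         .getOrCreate())

# Confirm the effective value on the new context
print(spark.sparkContext.getConf().get("spark.driver.maxResultSize"))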
answered Sep 19 '25 by Napoleon Borntoparty