 

Spark configuration change in runtime

I'm asking if anyone knows a way to change Spark properties (e.g. spark.executor.memory, spark.shuffle.spill.compress, etc.) at runtime, so that a change can take effect between the tasks/stages of a running job...

So I know that...

1) The documentation for Spark 2.0+ (and previous versions too) states that once the SparkContext has been created, it can't be changed at runtime.

2) SparkSession.conf.set may change a few things for SQL (see the sketch after this list), but I was looking for more general, all-encompassing configuration changes.

3) I could start a new context in the program with new properties, but the point here is to tune the properties while a job is already executing.
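For reference, item 2 above looks like this in a PySpark shell (a minimal sketch; the partition count is just an illustrative value):

>>> # SQL configurations can be changed on a live SparkSession:
>>> spark.conf.set("spark.sql.shuffle.partitions", "64")
>>> spark.conf.get("spark.sql.shuffle.partitions")
'64'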

Ideas...

1) Would killing an executor force it to re-read a configuration file, or does it just pick up whatever was configured at the beginning of the job?

2) Is there any command to force a "refresh" of the properties in the Spark context?

I'm hoping there's a way, or that someone out there has other ideas (thanks in advance)...


Tiago Perez


1 Answer

After submitting a Spark application, some parameter values can be changed at runtime and some cannot.

Using the spark.conf.isModifiable() method, we can check whether a parameter can be modified at runtime. If it returns True, the parameter value can be modified at runtime; otherwise, it can't.

Examples:

>>> spark.conf.isModifiable("spark.executor.memory")
False 
>>> spark.conf.isModifiable("spark.sql.shuffle.partitions")
True

So, based on the above checks, we can't modify the spark.executor.memory value at runtime, but we can modify spark.sql.shuffle.partitions.
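Putting it together, a guarded update looks like this (a minimal sketch; it assumes a live SparkSession and that the job was submitted with executor memory set to 4g):

>>> # Change a modifiable parameter and verify it took effect:
>>> if spark.conf.isModifiable("spark.sql.shuffle.partitions"):
...     spark.conf.set("spark.sql.shuffle.partitions", "100")
...
>>> spark.conf.get("spark.sql.shuffle.partitions")
'100'
>>> # Non-modifiable properties keep the value fixed at submission time:
>>> spark.conf.get("spark.executor.memory")
'4g'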


Ranga Reddy