So asking if anyone knows a way to change the Spark properties (e.g. spark.executor.memory, spark.shuffle.spill.compress, etc) during runtime, so that a change may take effect between the tasks/stages during a job...
So I know that...
1) The documentation for Spark 2.0+ (and previous versions too) state that once the Spark Context has been created, it can't be changed in runtime.
2) SparkSession.conf.set that may change a few things for SQL, but I was looking at more general, all encompassing configurations.
3) I could start a new context in the program with new properties, but the case here is to actually tune the properties once a job is already executing.
Ideas...
1) Would killing an Executor force it to read a configuration file again, or does it just get what's already configured during the beginning of the job?
2) Is there any command to force a "refresh" of the properties in spark context?
So hoping there might be a way or other ideas out there (thanks in advance)...
After submitting the Spark application, we can change a few parameter values at Runtime and a few not.
By using spark.conf.isModifiable() method, we can check parameter value we can modify at runtime or not. If the value returns true then we can modify the parameter value otherwise, we can't modify the value at runtime.
Examples:
>>> spark.conf.isModifiable("spark.executor.memory")
False
>>> spark.conf.isModifiable("spark.sql.shuffle.partitions")
True
So based on the above testing, we can't modify the spark.executor.memory parameter value at runtime.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With