I am trying to overwrite the Spark session/Spark context default configs, but it is picking up the entire node/cluster resource.
spark = SparkSession.builder \
    .master("ip") \
    .enableHiveSupport() \
    .getOrCreate()

spark.conf.set("spark.executor.memory", '8g')
spark.conf.set('spark.executor.cores', '3')
spark.conf.set('spark.cores.max', '3')
spark.conf.set("spark.driver.memory", '8g')
sc = spark.sparkContext
It works fine when I put the configuration in spark-submit:
spark-submit --master ip --executor-cores=3 --driver-memory 10G code.py
Once the SparkSession is instantiated, you can configure Spark's run-time config properties. From Spark 2.0.0 onwards, it is better to use SparkSession, as it provides access to all the functionality that sparkContext does, and it also provides APIs for working with DataFrames and Datasets.
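As a rough sketch (assuming a Spark 2.x+ session named spark), note that only run-time properties, mostly the spark.sql.* ones, can be changed this way; static properties such as spark.executor.memory are not applied to an already-running context, which is why the question's calls appear to have no effect:

# Run-time (mostly spark.sql.*) properties can be changed after start-up:
spark.conf.set("spark.sql.shuffle.partitions", "50")
print(spark.conf.get("spark.sql.shuffle.partitions"))  # '50'

# Static properties such as spark.executor.memory are fixed once the
# executors are running, so setting them here does not change the
# resources the cluster already allocated.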
You aren't actually overwriting anything with this code. To see it for yourself, try the following.
As soon as you start the pyspark shell, type:
sc.getConf().getAll()
This will show you all of the current config settings. Then try your code and do it again. Nothing changes.
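A minimal sketch of that check (hypothetical, using the shell's sc and the spark.conf.set lines from the question):

before = dict(sc.getConf().getAll())
# ... run the spark.conf.set(...) lines from the question here ...
after = dict(sc.getConf().getAll())
print(before.get("spark.executor.memory"), "->", after.get("spark.executor.memory"))
# prints the same value on both sides - the running context is unchanged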
What you should do instead is create a new configuration and use that to create a SparkContext. Do it like this:
import pyspark

conf = pyspark.SparkConf().setAll([
    ('spark.executor.memory', '8g'),
    ('spark.executor.cores', '3'),
    ('spark.cores.max', '3'),
    ('spark.driver.memory', '8g'),
])
sc.stop()                              # stop the existing context
sc = pyspark.SparkContext(conf=conf)   # start a new one with the new conf
Then you can check yourself just like above with:
sc.getConf().getAll()
This should reflect the configuration you wanted.
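If you also need a SparkSession on top of the re-created context (a sketch, assuming Spark 2.x+ where SparkSession is available), you can wrap the new context directly:

from pyspark.sql import SparkSession

# Build a session around the newly created context
spark = SparkSession(sc)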
Update configuration in Spark 2.3.1
To change the default Spark configurations, you can follow these steps:
Import the required classes
from pyspark.conf import SparkConf
from pyspark.sql import SparkSession
Get the default configurations
spark.sparkContext._conf.getAll()
Update the default configurations
conf = spark.sparkContext._conf.setAll([
    ('spark.executor.memory', '4g'),
    ('spark.app.name', 'Spark Updated Conf'),
    ('spark.executor.cores', '4'),
    ('spark.cores.max', '4'),
    ('spark.driver.memory', '4g'),
])
Stop the current Spark Session
spark.sparkContext.stop()
Create a Spark Session
spark = SparkSession.builder.config(conf=conf).getOrCreate()
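To confirm the new session picked up the values, an optional check reusing the same conf accessors as step 2:

spark.sparkContext.getConf().get("spark.executor.memory")  # '4g'
spark.sparkContext.getConf().getAll()                       # full updated config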