I'm trying to get the path to spark.worker.dir for the current SparkContext. If I explicitly set it as a config param, I can read it back out of SparkConf, but is there any way to access the complete config (including all defaults) using PySpark?
In Spark/PySpark you can get the current active SparkContext and its configuration settings by accessing spark.sparkContext.getConf().
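For example, a minimal sketch that reads one property back out, assuming an active SparkSession named spark (as in the pyspark shell):

# assumes an active SparkSession named `spark`
conf = spark.sparkContext.getConf()
# read a single property, with a fallback if it was never set
print(conf.get("spark.worker.dir", "not set"))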
There is no direct way to view the Spark configuration properties from the command line. Instead, you can check them in the spark-defaults.conf file, or view them in the Spark web UI (Environment tab).
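For reference, entries in conf/spark-defaults.conf are plain key/value pairs; the values below are purely illustrative:

spark.master        local[*]
spark.app.name      MyApp
spark.worker.dir    /tmp/spark-work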
A SparkContext represents the connection to a Spark cluster, and can be used to create RDDs and broadcast variables on that cluster. When you create a new SparkContext, at least the master and app name should be set, either through named parameters or through a SparkConf.
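A minimal sketch of that, with illustrative master and app name values:

from pyspark import SparkConf, SparkContext

# master and app name are the minimum settings for a new SparkContext
conf = SparkConf().setMaster("local[*]").setAppName("ConfInspection")
sc = SparkContext(conf=conf)
print(sc.getConf().get("spark.app.name"))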
In Spark or PySpark a SparkSession object is created programmatically using SparkSession.builder(); if you are using the Spark shell, a SparkSession object named spark is created for you by default as an implicit object, and the SparkContext is retrieved from the session via sparkSession.sparkContext.
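A sketch of that route in PySpark (note that in Python builder is an attribute rather than a method call; the app name below is illustrative):

from pyspark.sql import SparkSession

# programmatic equivalent of the shell's implicit `spark` object
spark = SparkSession.builder \
    .master("local[*]") \
    .appName("ConfInspection") \
    .getOrCreate()

# the SparkContext hangs off the session
sc = spark.sparkContext
print(sc.appName)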
Spark 2.1+
spark.sparkContext.getConf().getAll()
where spark is your SparkSession (this gives you a list of (key, value) tuples with all configured settings).
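If a dict is more convenient, the list converts directly; a small sketch, again assuming a SparkSession named spark:

# getAll() returns a list of (key, value) tuples; dict() makes lookups easy
settings = dict(spark.sparkContext.getConf().getAll())
print(settings.get("spark.worker.dir", "not explicitly set"))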
Yes: sc.getConf().getAll()
which uses the method SparkConf.getAll(), as accessed through sc.getConf().
And it does work:
In [4]: sc.getConf().getAll()
Out[4]:
[(u'spark.master', u'local'),
 (u'spark.rdd.compress', u'True'),
 (u'spark.serializer.objectStreamReset', u'100'),
 (u'spark.app.name', u'PySparkShell')]
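To make that easier to scan, the same list can be sorted and printed one property per line (sc here is the same SparkContext as above):

# sort by key and print one property per line
for key, value in sorted(sc.getConf().getAll()):
    print("{} = {}".format(key, value))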