When creating a Spark context in PySpark, I typically use the following code:
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setMaster("yarn-client")
        .setAppName(appname)
        .set("spark.executor.memory", "10g")
        .set("spark.executor.instances", "7")
        .set("spark.driver.memory", "5g")
        .set("spark.shuffle.service.enabled", "true")
        .set("spark.dynamicAllocation.enabled", "true")
        .set("spark.dynamicAllocation.minExecutors", "5"))
sc = SparkContext(conf=conf)
However, this puts it in the default queue, which is almost always over capacity. We have several less busy queues available, so my question is - how do I set my Spark context to use another queue?
Edit: To clarify - I'm looking to set the queue for interactive jobs (e.g., exploratory analysis in a Jupyter notebook), so I can't set the queue with spark-submit.
You can use the following argument in your spark-submit command:
--queue queue_name
Alternatively, you can set the spark.yarn.queue property in your code.
Hope this helps.
Try using spark.yarn.queue rather than queue:
import pyspark

conf = pyspark.SparkConf().set("spark.yarn.queue", "your_queue_name")
sc = pyspark.SparkContext(conf=conf)
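Putting it together with the configuration from the question, a minimal sketch for an interactive notebook session might look like this (the application name and the queue name "dev_queue" are placeholders; substitute your own):
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setMaster("yarn-client")
        .setAppName("notebook-session")  # placeholder app name
        .set("spark.executor.memory", "10g")
        .set("spark.executor.instances", "7")
        .set("spark.driver.memory", "5g")
        .set("spark.shuffle.service.enabled", "true")
        .set("spark.dynamicAllocation.enabled", "true")
        .set("spark.dynamicAllocation.minExecutors", "5")
        # route the application to a non-default YARN queue
        .set("spark.yarn.queue", "dev_queue"))
sc = SparkContext(conf=conf)
Note that the queue is read when the application registers with YARN, so it has to be set before the SparkContext is created; changing it on a running context has no effect.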