I have two versions of Python. When I launch a Spark application with spark-submit, it uses the default Python version, but I want it to use the other one. How do I specify which Python version spark-submit should use?
Recent Spark releases run on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+; the often-quoted requirement of Java 7+ and Python 2.6+ applies only to long-outdated Spark 1.x releases.
You can set the PYSPARK_PYTHON variable in conf/spark-env.sh (in Spark's installation directory) to the absolute path of the desired Python executable. The Spark distribution ships with spark-env.sh.template (spark-env.cmd.template on Windows) by default, which must first be renamed to spark-env.sh (spark-env.cmd).
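A minimal sketch of that renaming step, assuming SPARK_HOME points at the Spark installation directory:

cd "$SPARK_HOME/conf"
cp spark-env.sh.template spark-env.sh    # spark-env.cmd.template on Windows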
For example, if the Python executable is installed at /opt/anaconda3/bin/python3, add this line to spark-env.sh:

export PYSPARK_PYTHON='/opt/anaconda3/bin/python3'
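If you would rather not edit spark-env.sh, spark-submit also reads PYSPARK_PYTHON from the launching shell's environment, and since Spark 2.1 the equivalent spark.pyspark.python configuration property can be passed on the command line. A sketch, assuming a hypothetical application file my_app.py:

PYSPARK_PYTHON=/opt/anaconda3/bin/python3 spark-submit my_app.py
# or, equivalently:
spark-submit --conf spark.pyspark.python=/opt/anaconda3/bin/python3 my_app.py

If the driver should use a different interpreter than the executors, PYSPARK_DRIVER_PYTHON (or spark.pyspark.driver.python) overrides the choice for the driver alone.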
Check out the configuration documentation for more information.
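To confirm which interpreter is actually picked up, the pyspark shell reports it in its startup banner:

PYSPARK_PYTHON=/opt/anaconda3/bin/python3 pyspark
# the banner should report something like "Using Python version 3.x ..."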