I built Spark 1.4 from the GitHub development master, and the build went through fine. But when I run bin/pyspark
I get Python 2.7.9. How can I change this?
The current version of PySpark is 2.4.3 and works with Python 2.7, 3.3, and above.
PySpark requires Java version 7 or later and Python version 2.6 or later.
PySpark is a Spark library written in Python that lets you run Python applications using Apache Spark's capabilities, so there is no separate PySpark library to download. All you need is Spark.
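As a quick sanity check (a minimal sketch, assuming java and python3 are already on your PATH), you can confirm both requirements before launching Spark:
java -version        # should report 1.7 or newer
python3 --version    # the interpreter you want pyspark to use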
Just set the environment variable:
export PYSPARK_PYTHON=python3
If you want this to be a permanent change, add this line to the pyspark script.
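A minimal sketch of that permanent change (assuming a standard Spark layout where the launcher script lives at bin/pyspark under your Spark directory):
# add near the top of $SPARK_HOME/bin/pyspark
export PYSPARK_PYTHON=python3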
PYSPARK_PYTHON=python3
./bin/pyspark
If you want to run it in an IPython Notebook, write:
PYSPARK_PYTHON=python3
PYSPARK_DRIVER_PYTHON=ipython
PYSPARK_DRIVER_PYTHON_OPTS="notebook"
./bin/pyspark
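On newer installs where IPython Notebook has been renamed Jupyter, the equivalent (assuming the jupyter command is on your PATH) would be:
PYSPARK_DRIVER_PYTHON=jupyter
PYSPARK_DRIVER_PYTHON_OPTS="notebook"
./bin/pyspark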
If python3 is not on your PATH, you need to pass the full path to it instead.
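For example (a sketch; /usr/bin/python3 is a placeholder, use whatever path `which python3` reports on your machine):
PYSPARK_PYTHON=/usr/bin/python3
./bin/pyspark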
Bear in mind that the current documentation (as of 1.4.1) has outdated instructions. Fortunately, it has been patched.
1. Edit your profile: vim ~/.profile
2. Add this line to the file: export PYSPARK_PYTHON=python3
3. Reload it: source ~/.profile
4. Run ./bin/pyspark
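The relevant part of ~/.profile would then look something like this (a sketch; adjust the interpreter name or path for your system, and the driver line is optional):
# ~/.profile
export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=python3   # optional: keep the driver on the same interpreter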