conda create -y -n py38 python=3.8
conda activate py38
pip install pyspark
# Successfully installed py4j-0.10.7 pyspark-2.4.5
python -c "import pyspark"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/__init__.py", line 51, in <module>
    from pyspark.context import SparkContext
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/context.py", line 31, in <module>
    from pyspark import accumulators
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/accumulators.py", line 97, in <module>
    from pyspark.serializers import read_int, PickleSerializer
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/serializers.py", line 72, in <module>
    from pyspark import cloudpickle
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 145, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)
It seems that PySpark ships with a pre-packaged copy of the cloudpickle package that had issues on Python 3.8. Those issues are now resolved in the standalone pip release (at least as of cloudpickle 1.3.0), but the copy bundled with PySpark is still broken. Has anyone faced the same issue, or had any luck resolving it?
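The claim about the standalone release is easy to check, since the bug fires at import time (the _make_cell_set_template_code() call runs at module level). A quick sanity test in the same Python 3.8 environment; the version pin and the lambda round-trip are just illustrative:

pip install "cloudpickle>=1.3.0"
python -c "import cloudpickle; print(cloudpickle.__version__)"  # imports cleanly, unlike the bundled copy
python -c "import cloudpickle, pickle; f = lambda x: x + 1; print(pickle.loads(cloudpickle.dumps(f))(41))"  # prints 42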
You must downgrade your Python version from 3.8 to 3.7, because PySpark 2.4.5 does not support Python 3.8. (Python 3.8 support arrives with PySpark 3.0, so upgrading PySpark is the alternative once that release is available.)
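A minimal sketch of that fix, assuming conda as in the original session (the env name py37 is arbitrary):

conda create -y -n py37 python=3.7
conda activate py37
pip install pyspark==2.4.5
python -c "import pyspark"  # now imports without the cloudpickle TypeError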