ERROR: Unable to find py4j, your SPARK_HOME may not be configured correctly

I'm unable to run the following findspark call in a Jupyter notebook:

findspark.init('home/ubuntu/spark-3.0.0-bin-hadoop3.2')

I get the following error:

    ---------------------------------------------------------------------------
~/.local/lib/python3.6/site-packages/findspark.py in init(spark_home, python_path, edit_rc, edit_profile)
    144     except IndexError:
    145         raise Exception(
--> 146             "Unable to find py4j, your SPARK_HOME may not be configured correctly"
    147         )
    148     sys.path[:0] = [spark_python, py4j]

Exception: Unable to find py4j, your SPARK_HOME may not be configured correctly

I do have py4j installed, and I also tried adding the lines below to ~/.bashrc:

export SPARK_HOME=/home/ubuntu/spark-3.0.0-bin-hadoop3.2
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.9-src.zip:$PYTHONPATH
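For reference, the traceback shows the exception comes from an IndexError while findspark globs for a py4j zip under SPARK_HOME. A quick way to check whether your layout would trip it (a sketch; the install path is just my machine's, adjust as needed):

```python
import glob
import os

def find_py4j(spark_home):
    """Return the py4j-*.zip files findspark looks for under SPARK_HOME.

    An empty list means findspark.init() would raise
    'Unable to find py4j, your SPARK_HOME may not be configured correctly'.
    """
    return glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*.zip"))

# Adjust this path to wherever Spark was actually unpacked.
print(find_py4j(os.path.expanduser("~/spark-3.0.0-bin-hadoop3.2")))
```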
Sushmita098 asked Aug 25 '20

1 Answer

Check that the Spark version you installed is the same one you declare in SPARK_HOME.

For example (in Google Colab), I've installed:

!wget -q https://downloads.apache.org/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz

and then I declare:

os.environ["SPARK_HOME"] = "/content/spark-3.0.1-bin-hadoop3.2"

Note that spark-3.0.1-bin-hadoop3.2 must be identical in both places.
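One way to keep the two in sync (a sketch; /content and the archive name are just this Colab example's values) is to derive SPARK_HOME from the archive filename instead of typing it twice:

```python
import os

# The tarball name you downloaded; Spark tarballs unpack into a directory
# with the same name minus the .tgz extension.
archive = "spark-3.0.1-bin-hadoop3.2.tgz"
dirname = archive[: -len(".tgz")]  # spark-3.0.1-bin-hadoop3.2

# Build SPARK_HOME from that derived name, so it can never drift from
# the version you actually fetched.
os.environ["SPARK_HOME"] = os.path.join("/content", dirname)
print(os.environ["SPARK_HOME"])  # /content/spark-3.0.1-bin-hadoop3.2
```

If you later bump the Spark version, only the `archive` string needs to change.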

Adam Kuzański answered Sep 22 '22