
Pyspark command not recognised

I have Anaconda installed and have also downloaded Spark 1.6.2. I am following the instructions from this answer to configure Spark for Jupyter.

I have downloaded and unzipped the spark directory as


Now when I cd into this directory and into bin I see the following

SFOM00618927A:spark $ cd bin
SFOM00618927A:bin $ ls
beeline             pyspark       run-example.cmd   spark-class2.cmd  spark-sql          sparkR
beeline.cmd         pyspark.cmd   run-example2.cmd  spark-shell       spark-submit       sparkR.cmd
load-spark-env.cmd  pyspark2.cmd  spark-class       spark-shell.cmd   spark-submit.cmd   sparkR2.cmd
load-spark-env.sh   run-example   spark-class.cmd   spark-shell2.cmd  spark-submit2.cmd

I have also added the environment variables mentioned in the above answer to my .bash_profile and .profile.

The first thing I want to check is whether the pyspark command works from the shell inside the spark/bin directory.

So after doing cd spark/bin I run this:

SFOM00618927A:bin $ pyspark
-bash: pyspark: command not found

According to the answer, after following all the steps I should be able to just run

pyspark

in a terminal from any directory, and it should start a Jupyter notebook with the Spark engine. But pyspark does not even work in the shell, let alone in a Jupyter notebook.

Please advise what is going wrong here.


I ran

open .profile

in my home directory, and this is what is stored in the path.

export PATH=/Users/854319/anaconda/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/Users/854319/spark/bin
export PYSPARK_DRIVER_PYTHON_OPTS='notebook' pyspark
Baktaawar asked Aug 05 '16 22:08


1 Answer

1- You need to set JAVA_HOME and the Spark paths so that the shell can find them. After setting them in your .profile you may want to run

source ~/.profile

to apply the settings to the current session. From your comment I can see you are already running into the JAVA_HOME issue.

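Note that if you have a .bash_profile or .bash_login, your .profile will not be read: a bash login shell reads only the first of ~/.bash_profile, ~/.bash_login and ~/.profile that it finds.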
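To see which of these files actually takes effect, a small check along these lines may help. It is only a sketch: first_login_file is a hypothetical helper name, and it simply mirrors the documented lookup order of a bash login shell (~/.bash_profile, then ~/.bash_login, then ~/.profile).

```shell
# Print the startup file a bash login shell would read from the given
# home directory: the first readable one of the three candidates.
# (first_login_file is a made-up helper name, not a bash builtin.)
first_login_file() {
    home_dir="$1"
    for name in .bash_profile .bash_login .profile; do
        if [ -r "$home_dir/$name" ]; then
            printf '%s\n' "$home_dir/$name"
            return 0
        fi
    done
    return 1
}

first_login_file "$HOME" || echo "no login startup file found in $HOME"
```

Whichever file this prints is the one where the JAVA_HOME and PATH exports need to live, or from which the other file needs to be sourced.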

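2- When you are in spark/bin you need to run

./pyspark

to tell the shell that the target is in the current folder. The shell looks up a bare command name only in the directories listed in PATH, and the current folder is not searched unless it is listed there.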
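The point about the current folder can be checked from inside spark/bin with something like the following. This is a rough sketch: dir_on_path is a hypothetical helper, and it ignores the special case of empty or "." entries in PATH, which also stand for the current directory.

```shell
# Succeed if directory $1 is one of the colon-separated entries in $2.
# (dir_on_path is a made-up helper name used only for this illustration.)
dir_on_path() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# "command -v" reports what the shell would run for a bare command name.
if command -v pyspark >/dev/null 2>&1; then
    echo "pyspark resolves to $(command -v pyspark)"
else
    echo "pyspark is not found through PATH"
fi

# Being inside a directory does not put that directory on PATH.
if dir_on_path "$PWD" "$PATH"; then
    echo "$PWD is on PATH"
else
    echo "$PWD is not on PATH; use ./name to run a script located here"
fi
```

If the first check fails in a new terminal even though .profile exports the spark/bin directory, that suggests the file is not being read at login.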

shuaiyuancn answered Sep 29 '22 14:09
