I just downloaded spark-2.3.0-bin-hadoop2.7.tgz. After downloading, I followed the steps mentioned here: pyspark installation for windows 10. I used the command bin\pyspark to run Spark and got this error message:
The system cannot find the path specified
Attached is a screenshot of the error message.
Attached is a screenshot of my Spark bin folder.
Attached is a screenshot of my path variable.
I have Python 3.6 and Java "1.8.0_151" on my Windows 10 system. Can you suggest how to resolve this issue?
Actually, the problem was with the JAVA_HOME environment variable. JAVA_HOME was previously set to .../jdk/bin. Stripping the trailing /bin from JAVA_HOME, while keeping /jdk/bin in the system or user path variable (%path%), did the trick.
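One way to spot this situation before launching pyspark is to inspect the value of JAVA_HOME. The sketch below is only an illustration and not part of Spark: the helper name strip_trailing_bin is made up for this example, and it uses Python's ntpath module so that Windows-style paths are handled the same way on any machine.

```python
import ntpath
import os


def strip_trailing_bin(home):
    """Return `home` without a trailing 'bin' component (hypothetical helper)."""
    # Drop trailing separators so "C:\\jdk\\bin\\" is treated like "C:\\jdk\\bin".
    cleaned = home.rstrip("\\/")
    head, tail = ntpath.split(cleaned)
    # Only strip when the last path component is literally "bin" (any case).
    return head if tail.lower() == "bin" else cleaned


if __name__ == "__main__":
    java_home = os.environ.get("JAVA_HOME", "")
    fixed = strip_trailing_bin(java_home)
    if java_home and fixed != java_home.rstrip("\\/"):
        print("JAVA_HOME ends in bin; it should probably be:", fixed)
    else:
        print("JAVA_HOME =", java_home or "(not set)")
```

The script only reports the suggested value; the variable itself still has to be changed in the Windows environment settings.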
My problem was that JAVA_HOME was pointing to the JRE folder instead of the JDK. Make sure you check that.
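A rough way to tell the two apart is to look for the Java compiler, which ships with a JDK but not with a JRE. The following is a heuristic sketch under that assumption; the function name looks_like_jdk is hypothetical.

```python
import os


def looks_like_jdk(java_home):
    """Heuristic: a JDK has the javac compiler in its bin folder, a JRE does not."""
    bin_dir = os.path.join(java_home, "bin")
    # javac.exe on Windows, plain javac on other systems.
    return any(
        os.path.isfile(os.path.join(bin_dir, name))
        for name in ("javac.exe", "javac")
    )


if __name__ == "__main__":
    java_home = os.environ.get("JAVA_HOME")
    if not java_home:
        print("JAVA_HOME is not set")
    elif looks_like_jdk(java_home):
        print("JAVA_HOME looks like a JDK:", java_home)
    else:
        print("No javac under JAVA_HOME; it may point to a JRE:", java_home)
```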
I worked on this for hours. My problem was the Java 10 installation. I uninstalled it and installed Java 8, and now PySpark works.
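To see which Java version is actually being picked up, the output of java -version can be parsed. This is a minimal sketch with hypothetical helper names; note that it checks the java found on PATH, which may differ from the one JAVA_HOME points to.

```python
import re
import shutil
import subprocess


def java_major_version(version):
    """Map a version string such as '1.8.0_151' or '10.0.1' to its major number."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version.strip().strip('"'))
    if not match:
        raise ValueError("unrecognised Java version: %r" % version)
    first, second = match.group(1), match.group(2)
    # Up to Java 8 the scheme was 1.x (1.8 means Java 8);
    # from Java 9 on, the first number is the major version.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def version_from_output(text):
    """Pull the quoted version out of `java -version` output."""
    match = re.search(r'version "([^"]+)"', text)
    return match.group(1) if match else None


if __name__ == "__main__":
    if shutil.which("java") is None:
        print("java is not on PATH")
    else:
        # `java -version` writes its report to stderr, not stdout.
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
        version = version_from_output(result.stderr)
        if version is None:
            print("Could not parse the java -version output")
        else:
            print(version, "is Java", java_major_version(version))
```

Going by the answers in this thread, a result of 8 is what worked with this Spark release, while 10 did not.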
Switching SPARK_HOME to C:\spark\spark-2.3.0-bin-hadoop2.7 and changing PATH to include %SPARK_HOME%\bin did the trick for me.
Originally my SPARK_HOME was set to C:\spark\spark-2.3.0-bin-hadoop2.7\bin and PATH was referencing it as %SPARK_HOME%.
Running a Spark command directly in my SPARK_HOME directory worked, but only once. After that initial success I got the same error as you, and noticed that echo %SPARK_HOME% was showing C:\spark\spark-2.3.0-bin-hadoop2.7\bin\..
I thought perhaps spark-shell2.cmd had edited it in an attempt to get itself working, which led me here.
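The relationship between the two variables can be checked mechanically. The sketch below assumes the layout described in this answer (SPARK_HOME is the install root and PATH contains its bin subfolder); the function name spark_bin_on_path is hypothetical, and ntpath is used so Windows paths are compared the same way on any platform.

```python
import ntpath
import os


def spark_bin_on_path(spark_home, path_value):
    """Return True if <spark_home>\\bin is one of the entries in a Windows PATH string."""
    def norm(p):
        # normpath collapses ".." and trailing separators; normcase ignores case.
        return ntpath.normcase(ntpath.normpath(p))

    wanted = norm(ntpath.join(spark_home, "bin"))
    entries = [e.strip() for e in path_value.split(";") if e.strip()]
    return any(norm(e) == wanted for e in entries)


if __name__ == "__main__":
    home = os.environ.get("SPARK_HOME", "")
    if not home:
        print("SPARK_HOME is not set")
    elif spark_bin_on_path(home, os.environ.get("PATH", "")):
        print("PATH contains SPARK_HOME\\bin")
    else:
        print("PATH does not contain SPARK_HOME\\bin; check both variables")
```

With SPARK_HOME wrongly set to the bin folder itself, the function looks for ...\bin\bin on PATH and returns False, which matches the broken setup described above.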
For those who use Windows and are still trying: what solved it for me was reinstalling Python (3.9) as a local user (c:\Users\<user>\AppData\Local\Programs\Python) and setting both environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON to c:\Users\<user>\AppData\Local\Programs\Python\python.exe
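When Spark is started from a Python script rather than through bin\pyspark, the same two variables can also be set from inside the script. This is a sketch under that assumption; it uses sys.executable (the interpreter running the script) instead of a hard-coded path, and it has to run before any SparkSession or SparkContext is created.

```python
import os
import sys

# Use the interpreter that is running this script for Spark's Python processes.
# Set these before creating a SparkSession or SparkContext.
os.environ["PYSPARK_PYTHON"] = sys.executable
# PYSPARK_DRIVER_PYTHON is mainly read by the launcher scripts;
# it is set here only to keep both values consistent.
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable

print("PYSPARK_PYTHON =", os.environ["PYSPARK_PYTHON"])
print("PYSPARK_DRIVER_PYTHON =", os.environ["PYSPARK_DRIVER_PYTHON"])
```

For the bin\pyspark launcher itself, the variables still need to be defined in the Windows environment beforehand, as described above.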
Fixing problems installing Pyspark (Windows)
Incorrect JAVA_HOME path
> pyspark
The system cannot find the path specified.
Open System Environment variables:
rundll32 sysdm.cpl,EditEnvironmentVariables
Set JAVA_HOME: System Variables > New:
Variable Name: JAVA_HOME
Variable Value: C:\Program Files\Java\jdk1.8.0_261
Also, check that SPARK_HOME and HADOOP_HOME are correctly set, e.g.:
SPARK_HOME=C:\Spark\spark-3.2.0-bin-hadoop3.2
HADOOP_HOME=C:\Spark\spark-3.2.0-bin-hadoop3.2
Important: Double-check that these paths do not point to the bin folder.
Incorrect Java version
> pyspark
WARN SparkContext: Another SparkContext is being constructed
UserWarning: Failed to initialize Spark session.
java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.storage.StorageUtils$
Ensure that JAVA_HOME is set to Java 8 (jdk1.8.0)
winutils not installed
> pyspark
WARN Shell: Did not find winutils.exe
java.io.FileNotFoundException: Could not locate Hadoop executable
Download winutils.exe and copy it to your Spark home bin folder (PowerShell):
Invoke-WebRequest -OutFile C:\Spark\spark-3.2.0-bin-hadoop3.2\bin\winutils.exe -Uri https://github.com/steveloughran/winutils/raw/master/hadoop-3.0.0/bin/winutils.exe
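After the download, it may be worth confirming that the file landed where Hadoop looks for it. The sketch below assumes HADOOP_HOME is set as shown earlier and that winutils.exe is expected in its bin subfolder; the function name find_winutils is hypothetical.

```python
import os


def find_winutils(hadoop_home):
    """Return the path to winutils.exe under <hadoop_home>/bin, or None if it is missing."""
    candidate = os.path.join(hadoop_home, "bin", "winutils.exe")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    hadoop_home = os.environ.get("HADOOP_HOME")
    if not hadoop_home:
        print("HADOOP_HOME is not set")
    else:
        found = find_winutils(hadoop_home)
        print("winutils.exe found at " + found if found else "winutils.exe is missing")
```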