I know there is a very similar post to this one (Failed to locate the winutils binary in the hadoop binary path); however, I have tried every step suggested there and the same error still appears.
I'm trying to use Apache Spark 1.6.0 on Windows 7 to follow the tutorial at http://spark.apache.org/docs/latest/streaming-programming-guide.html, specifically this command:
./bin/run-example streaming.JavaNetworkWordCount localhost 9999
However, this error keeps appearing:
After reading the post Failed to locate the winutils binary in the hadoop binary path,
I realized I needed the winutils.exe file, so I downloaded a Hadoop 2.6.0 binary that includes it and defined an environment variable called HADOOP_HOME:
with value C:\Users\GERAL\Desktop\hadoop-2.6.0\bin
and added it to Path like this: %HADOOP_HOME%
Yet the same error still appears when I run the command. Does anyone know how to solve this?
What Does Spark Need WinUtils For? To run Apache Spark locally, you need a piece of the Hadoop code base known as 'WinUtils'. It manages the POSIX file-system permissions that the HDFS code expects of the local file system.
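In practice this means Hadoop looks for winutils.exe under %HADOOP_HOME%\bin, and Spark fails at startup when it is not there. A portable sanity-check sketch (the directory layout is simulated in a temp folder, and winutils.exe here is an empty stand-in file, not the real binary; on a real Windows box you would point HADOOP_HOME at your actual install):

```shell
#!/bin/sh
# Sketch: verify winutils.exe sits where Hadoop looks for it, i.e.
# $HADOOP_HOME/bin/winutils.exe. The layout is simulated in a temp
# directory so the check itself runs anywhere.
HADOOP_HOME="$(mktemp -d)/hadoop-2.6.0"
mkdir -p "$HADOOP_HOME/bin"
: > "$HADOOP_HOME/bin/winutils.exe"   # empty stand-in for the real binary

if [ -f "$HADOOP_HOME/bin/winutils.exe" ]; then
  echo "winutils found"
else
  echo "winutils missing - Spark will fail to locate the winutils binary"
fi
```

If the check prints "winutils missing" against your real HADOOP_HOME, that is exactly the situation that produces the error in the question.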
If you are running Spark on Windows with Hadoop, then you need to ensure your Windows Hadoop installation is set up properly. To run Spark, you need winutils.exe and winutils.dll in the bin folder of your Hadoop home directory.
I would ask you to try this first:
1) Download the .dll and .exe files from the bundle at the link below.
https://codeload.github.com/sardetushar/hadooponwindows/zip/master
2) Copy winutils.exe and winutils.dll from that folder to your $HADOOP_HOME/bin.
3) Set HADOOP_HOME, either in your spark-env.sh or at the command line, and add HADOOP_HOME/bin to PATH, then try running.
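The three steps above boil down to the following sketch (POSIX syntax for illustration; on Windows itself you would use set/setx in cmd, and the install path below is an example, not your actual one):

```shell
#!/bin/sh
# Step 3 sketch: HADOOP_HOME points at the Hadoop *root* directory (not
# its bin folder), and HADOOP_HOME/bin - the folder holding winutils.exe
# and winutils.dll - goes on PATH.
# The path below is an example; substitute your own install location.
export HADOOP_HOME="$HOME/hadoop-2.6.0"
export PATH="$PATH:$HADOOP_HOME/bin"

echo "HADOOP_HOME=$HADOOP_HOME"
echo "$PATH" | grep -q "$HADOOP_HOME/bin" && echo "bin folder is on PATH"
```

Note that HADOOP_HOME should name the hadoop-2.6.0 folder itself; the question's setup, where the variable ends in \bin, would make Hadoop look for winutils.exe in ...\bin\bin.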
If you need any help with the Hadoop installation itself, there is a nice guide here:
http://toodey.com/2015/08/10/hadoop-installation-on-windows-without-cygwin-in-10-mints/
But that can wait; try the first few steps above.
Install JDK 1.8, download the Spark binary from Apache Spark, and download winutils from the Git repo.
Set the user environment variables for the JDK, the Spark binary, and winutils:
JAVA_HOME = C:\Program Files\Java\jdk1.8.0_73
HADOOP_HOME = C:\Hadoop
SPARK_HOME = C:\spark-2.3.1-bin-hadoop2.7
PATH = C:\Program Files\Java\jdk1.8.0_73\bin;%HADOOP_HOME%\bin;%SPARK_HOME%\bin;
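In a POSIX shell the same variable layout would look like this (Git-Bash-style paths, purely illustrative; on Windows itself these values go into System Properties > Environment Variables or are set with setx):

```shell
#!/bin/sh
# Sketch of the variable layout above, using the example values from the
# answer; substitute your own install locations.
export JAVA_HOME="/c/Program Files/Java/jdk1.8.0_73"
export HADOOP_HOME="/c/Hadoop"
export SPARK_HOME="/c/spark-2.3.1-bin-hadoop2.7"
# Each tool's bin folder is prepended so that java, winutils and
# spark-shell all resolve from the command line.
export PATH="$JAVA_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$PATH"
echo "PATH entries added for JAVA_HOME, HADOOP_HOME and SPARK_HOME"
```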
Open a command prompt and run spark-shell.