
Spark 1.6: Failed to locate the winutils binary in the hadoop binary path

I know there is a very similar post to this one (Failed to locate the winutils binary in the hadoop binary path); however, I have tried every step suggested there and the same error still appears.

I'm trying to use Apache Spark 1.6.0 on Windows 7 to follow the tutorial on this page: http://spark.apache.org/docs/latest/streaming-programming-guide.html, specifically by running this command:

./bin/run-example streaming.JavaNetworkWordCount localhost 9999

However, this error keeps appearing: [screenshot of the error output]

After reading the post "Failed to locate the winutils binary in the hadoop binary path", I realized I needed the winutils.exe file, so I downloaded a Hadoop 2.6.0 binary that includes it and defined an environment variable called HADOOP_HOME with the value:

 C:\Users\GERAL\Desktop\hadoop-2.6.0\bin

I then added it to Path like this: %HADOOP_HOME%

Yet the same error still appears when I try the code. Does anyone know how to solve this?

manuel mourato asked Jan 09 '16 19:01




2 Answers

If you are running Spark on Windows with Hadoop, you need to make sure your Windows Hadoop installation is set up properly. To run Spark, you need to have winutils.exe and winutils.dll in the bin folder of your Hadoop home directory.

I would ask you to try this first:

1) Download the .dll and .exe files from the bundle at the link below.

https://codeload.github.com/sardetushar/hadooponwindows/zip/master

2) Copy winutils.exe and winutils.dll from that folder to your $HADOOP_HOME/bin.

3) Set HADOOP_HOME either in your spark-env.sh or at the command line, and add HADOOP_HOME/bin to PATH.

Then try running the example again.
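
To sanity-check the result of these steps, a small script can help. The following is a minimal sketch in Python and is not part of the original answer. It assumes Hadoop looks for the binary at HADOOP_HOME\bin\winutils.exe, which would mean HADOOP_HOME should name the Hadoop root folder rather than its bin folder. The function name check_hadoop_home is hypothetical.

```python
import os
from pathlib import Path


def check_hadoop_home(env=os.environ):
    """Return a list of problems found with the HADOOP_HOME setting.

    `env` is any mapping of variable names to values; it defaults to the
    real process environment.
    """
    home = env.get("HADOOP_HOME")
    if not home:
        return ["HADOOP_HOME is not set"]

    problems = []
    home_path = Path(home)

    # Assumption: Hadoop appends bin\winutils.exe to HADOOP_HOME itself,
    # so a value that already ends in "bin" is probably one level too deep.
    if home_path.name.lower() == "bin":
        problems.append("HADOOP_HOME ends in 'bin'; it should be the parent folder")

    # The file the error message complains about.
    winutils = home_path / "bin" / "winutils.exe"
    if not winutils.is_file():
        problems.append(f"winutils.exe not found at {winutils}")

    return problems


if __name__ == "__main__":
    found = check_hadoop_home()
    for problem in found:
        print("Problem:", problem)
    if not found:
        print("HADOOP_HOME looks fine")
```

Run it from the same command prompt you use for Spark, so that it sees the same environment variables. An empty result only means the folder layout matches the assumption above; it does not prove that Spark will start.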

If you need help with the Hadoop installation itself, this guide is worth a try:

http://toodey.com/2015/08/10/hadoop-installation-on-windows-without-cygwin-in-10-mints/

But that can wait; try the steps above first.

Srini answered Oct 18 '22 05:10


Install JDK 1.8, then download the Spark binary from Apache Spark and winutils from the Git repo.

Set the user variables for the JDK, the Spark binary, and winutils:

JAVA_HOME
C:\Program Files\Java\jdk1.8.0_73

HADOOP_HOME
C:\Hadoop

SPARK_HOME
C:\spark-2.3.1-bin-hadoop2.7

PATH
C:\Program Files\Java\jdk1.8.0_73\bin;%HADOOP_HOME%\bin;%SPARK_HOME%\bin;

Open a command prompt and run spark-shell.
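
To see what the PATH value above resolves to, here is a minimal sketch in Python. It is an illustration only: it mimics how %NAME% references are substituted, and the function name expand_windows_path is hypothetical. It assumes the variables mapping uses upper-case keys.

```python
import re


def expand_windows_path(path_value, variables):
    """Expand %NAME% references in a PATH string and split it into entries.

    `variables` maps upper-case variable names to their values. Unknown
    references are left untouched so they are easy to spot.
    """
    def substitute(match):
        name = match.group(1).upper()
        return variables.get(name, match.group(0))

    expanded = re.sub(r"%([^%;]+)%", substitute, path_value)
    # Drop empty entries, such as the one produced by a trailing semicolon.
    return [entry for entry in expanded.split(";") if entry]


if __name__ == "__main__":
    variables = {
        "HADOOP_HOME": r"C:\Hadoop",
        "SPARK_HOME": r"C:\spark-2.3.1-bin-hadoop2.7",
    }
    path_value = (
        r"C:\Program Files\Java\jdk1.8.0_73\bin;"
        r"%HADOOP_HOME%\bin;%SPARK_HOME%\bin;"
    )
    for entry in expand_windows_path(path_value, variables):
        print(entry)
```

With the values from this answer, the Hadoop entry comes out as C:\Hadoop\bin. That is the folder that needs to contain winutils.exe.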


satish hiremath answered Oct 18 '22 04:10