Spark on Windows - What exactly is winutils and why do we need it?

Question

I'm curious. As far as I know, HDFS needs DataNode processes to run, which is why it only works on servers. Spark can run locally, but it needs winutils.exe, which is a component of Hadoop. What exactly does it do? How is it that I cannot run Hadoop on Windows, but I can run Spark, which is built on Hadoop?

saurzcode · Accepted Answer

Max's answer already covers the place where winutils is actually referenced, so let me give some brief background on why it is needed on Windows.

From Hadoop's Confluence page:

Hadoop requires native libraries on Windows to work properly. That includes accessing the file:// filesystem, where Hadoop uses some Windows APIs to implement POSIX-like file access permissions.

This is implemented in HADOOP.DLL and WINUTILS.EXE.

In particular, %HADOOP_HOME%\BIN\WINUTILS.EXE must be locatable.
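To make the "must be locatable" requirement concrete, here is a minimal Python sketch that checks whether winutils.exe sits under %HADOOP_HOME%\bin. The helper name find_winutils is hypothetical and only for illustration; it is not part of Hadoop or Spark, and it only mirrors the path from the quote above rather than Hadoop's actual lookup code.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_winutils(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the path to winutils.exe under HADOOP_HOME, or None if it is not locatable.

    Hypothetical helper: it checks only the %HADOOP_HOME%\\bin\\winutils.exe
    location named in the quote, nothing else.
    """
    env = os.environ if env is None else env
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        return None  # HADOOP_HOME is not set (or is empty)
    candidate = Path(hadoop_home) / "bin" / "winutils.exe"
    # The file has to exist at exactly this location to count as locatable.
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_winutils()
    if found is None:
        print("winutils.exe is not locatable via HADOOP_HOME")
    else:
        print(f"winutils.exe found at {found}")
```

Running a check like this before starting a local Spark session on Windows can help tell a missing or misconfigured HADOOP_HOME apart from other startup problems.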

Also, I think you should be able to run both Spark and Hadoop on Windows.

Tags:

apache-spark

hadoop

lte__
