I already have Hadoop 3.0.0 installed. Should I now install the with-hadoop or without-hadoop version of Apache Spark from this page?
I am following this guide to get started with Apache Spark.
It says:
"Download the latest version of Apache Spark (Pre-built according to your Hadoop version) from this link:..."
But I am confused. If I already have an instance of Hadoop running on my machine, and I then download, install and run Apache-Spark-WITH-Hadoop, won't it start an additional instance of Hadoop?
First off, as far as I know, Spark does not yet support Hadoop 3. You can tell because no Hadoop 3 choice is offered under "your Hadoop version" among the available downloads.
Regardless of which package you download, though, you can try setting HADOOP_CONF_DIR and HADOOP_HOME in your spark-env.sh.
You should always download the version without Hadoop if you already have it.
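For illustration, the spark-env.sh settings mentioned above might look roughly like this. It is only a sketch: /opt/hadoop is a hypothetical install location, so substitute wherever your Hadoop actually lives. The SPARK_DIST_CLASSPATH line is what the Spark documentation describes for the without-hadoop ("Hadoop free") build, and I have not verified any of this against Hadoop 3.

```shell
# conf/spark-env.sh (sketch)
# /opt/hadoop is an assumed path; an existing HADOOP_HOME takes precedence.
export HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"

# Directory holding core-site.xml, hdfs-site.xml, yarn-site.xml, etc.
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$HADOOP_HOME/etc/hadoop}"

# For the without-hadoop build: put the existing Hadoop jars on Spark's
# classpath. Skipped if the hadoop launcher is not found at that path.
if [ -x "$HADOOP_HOME/bin/hadoop" ]; then
  export SPARK_DIST_CLASSPATH="$("$HADOOP_HOME/bin/hadoop" classpath)"
fi
```

Note that this only tells Spark where to find the Hadoop libraries and configuration. It does not start any Hadoop services.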
"won't it start another additional instance of Hadoop?"
No. You would still need to explicitly configure and start that version of Hadoop yourself.
That Spark package is, I believe, already configured to use the Hadoop it includes.