I am trying to run Spark on YARN and I am running into this error:
Exception in thread "main" java.lang.Exception: When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
I am not sure where the "environment" is (what specific file?). I tried using:
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
in my .bash_profile, but this doesn't seem to help.
Also, on a CDH cluster, HADOOP_CONF_DIR should be set to /etc/hadoop/conf by default.
When running Spark on YARN, you need to add the following line to spark-env.sh:
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
Note: check that $HADOOP_HOME/etc/hadoop is the correct path in your environment, and that spark-env.sh also exports HADOOP_HOME.
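To carry out the check in the note above, something like the following may help. It is only a rough sketch: check_conf_dir is a hypothetical helper name, not part of Spark or Hadoop, and it assumes that a usable configuration directory contains a yarn-site.xml file. Run it in the same shell you launch spark-submit from, since that is the environment Spark reads.

```shell
# Hypothetical helper (not a Spark or Hadoop command): reports whether
# the given directory looks like a usable Hadoop/YARN config directory.
check_conf_dir() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "unset"            # neither variable was set in this shell
    elif [ ! -d "$dir" ]; then
        echo "missing"          # variable is set, but the path is not a directory
    elif [ ! -f "$dir/yarn-site.xml" ]; then
        echo "no-yarn-site"     # directory exists, but has no yarn-site.xml (assumed marker file)
    else
        echo "ok"
    fi
}

# Spark accepts either variable, so check whichever one is set.
check_conf_dir "${HADOOP_CONF_DIR:-${YARN_CONF_DIR:-}}"
```

If this prints "unset" even though the exports are in a profile file, that file was probably not sourced by the shell that starts Spark. If it prints "missing" or "no-yarn-site", the path (for example an empty or wrong HADOOP_HOME) is the more likely culprit.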