Value for HADOOP_CONF_DIR from Cluster

Tags:

apache-spark

hadoop-yarn

I have set up a YARN cluster using Ambari, with 3 VMs as hosts.

Where can I find the value for HADOOP_CONF_DIR?

# Run on a YARN cluster
# (--master can also be `yarn-client` for client mode)
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000


asked Dec 17 '15 11:12

nish1013

1 Answer

Install Hadoop as well; in my case it is installed in /usr/local/hadoop.

Set up the Hadoop environment variable:

export HADOOP_INSTALL=/usr/local/hadoop

Then set the configuration directory:

export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop
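Before exporting, it can help to check that the directory really holds the client-side Hadoop configuration files. The following is a minimal sketch, not a definitive method: the helper names `is_hadoop_conf_dir` and `find_hadoop_conf_dir` are hypothetical, and the candidate path /etc/hadoop/conf is an assumption about where an Ambari-managed host may keep its configuration, so adjust the list for your own cluster.

```shell
# Hypothetical helper: succeeds if the directory contains the two
# client-side files a YARN submission normally needs.
is_hadoop_conf_dir() {
  [ -f "$1/core-site.xml" ] && [ -f "$1/yarn-site.xml" ]
}

# Hypothetical helper: prints the first candidate directory that passes
# the check above; returns non-zero if none does.
find_hadoop_conf_dir() {
  for dir in "$@"; do
    if is_hadoop_conf_dir "$dir"; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Candidate paths are assumptions; replace them with your own locations.
if conf_dir=$(find_hadoop_conf_dir \
    "${HADOOP_INSTALL:-/usr/local/hadoop}/etc/hadoop" \
    /etc/hadoop/conf); then
  export HADOOP_CONF_DIR="$conf_dir"
  echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
else
  echo "No Hadoop configuration directory found" >&2
fi
```

If neither candidate matches, look for yarn-site.xml on one of the cluster hosts and use the directory that contains it.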


answered Sep 24 '22 08:09

Saurabh
