I am running Spark 2, Hive, and Hadoop on my local machine, and I want to use Spark SQL to read data from a Hive table.
Everything works fine when Hadoop is running at the default hdfs://localhost:9000, but if I change to a different port in core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9099</value>
</property>
then running a simple query spark.sql("select * from archive.tcsv3 limit 100").show()
in spark-shell gives me the error:
ERROR metastore.RetryingHMSHandler: AlreadyExistsException(message:Database default already exists)
.....
From local/147.214.109.160 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused;
.....
I was getting the AlreadyExistsException before this change too, and it doesn't seem to affect the result.
I can make it work by stopping the existing SparkContext and creating a new one:
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

// Stop the context the shell created at startup, then build a fresh one
// that picks up the current Hadoop configuration.
sc.stop()
val sc = new SparkContext()
val session = SparkSession.builder().master("local").appName("test").enableHiveSupport().getOrCreate()
session.sql("show tables").show()
My question is: why did the initial SparkSession/SparkContext not pick up the correct configuration, and how can I fix it? Thanks!
All functionality available with SparkContext is also available through SparkSession, which additionally provides APIs for working with DataFrames and Datasets.
In earlier versions of Spark and PySpark, SparkContext was the entry point for programming with RDDs and connecting to the Spark cluster. With the introduction of Spark 2.0, SparkSession became the entry point for programming with DataFrames and Datasets.
SparkSession encapsulates SparkContext and lets you set Spark configuration parameters. Through the SparkContext, the driver can also access the other contexts, such as SQLContext, HiveContext, and StreamingContext.
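For example, a single SparkSession gives you both the Dataset API and, through its wrapped SparkContext, the RDD API. This is a minimal sketch; the app name is just a placeholder:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local")
  .appName("entry-point-demo")
  .getOrCreate()

import spark.implicits._

// Dataset API directly on the session
val ds = Seq(1, 2, 3).toDS()

// RDD API through the encapsulated SparkContext
val rdd = spark.sparkContext.parallelize(Seq(1, 2, 3))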
If you are using SparkSession and want to set configuration on the underlying SparkContext, use session.sparkContext:
import org.apache.spark.sql.SparkSession

val session = SparkSession
  .builder()
  .appName("test")
  .enableHiveSupport()
  .getOrCreate()

import session.implicits._

// Hadoop-level settings go on the Hadoop configuration of the wrapped context
session.sparkContext.hadoopConfiguration.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
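By the same pattern, the stale fs.defaultFS from the question could be overridden on the running session before querying Hive. This is a minimal sketch, assuming the NameNode from the question is listening on localhost:9099; note that table locations already recorded in the Hive metastore may still point at the old port, in which case the metastore entries themselves would need updating:

// Hypothetical sketch: point the session's Hadoop config at the new NameNode
// port (hdfs://localhost:9099 from the question) before running Hive queries.
session.sparkContext.hadoopConfiguration.set("fs.defaultFS", "hdfs://localhost:9099")
session.sql("select * from archive.tcsv3 limit 100").show()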
You don't need to import SparkContext or create it before the SparkSession.