I have a scenario where I need to compare two tables, a source and a destination, on two separate remote Hive servers. Can we use two SparkSessions, something like what I tried below?
val spark = SparkSession.builder().master("local")
.appName("spark remote")
.config("javax.jdo.option.ConnectionURL", "jdbc:mysql://192.168.175.160:3306/metastore?useSSL=false")
.config("javax.jdo.option.ConnectionUserName", "hiveroot")
.config("javax.jdo.option.ConnectionPassword", "hivepassword")
.config("hive.exec.scratchdir", "/tmp/hive/${user.name}")
.config("hive.metastore.uris", "thrift://192.168.175.160:9083")
.enableHiveSupport()
.getOrCreate()
SparkSession.clearActiveSession()
SparkSession.clearDefaultSession()
val sparkdestination = SparkSession.builder()
.config("javax.jdo.option.ConnectionURL", "jdbc:mysql://192.168.175.42:3306/metastore?useSSL=false")
.config("javax.jdo.option.ConnectionUserName", "hiveroot")
.config("javax.jdo.option.ConnectionPassword", "hivepassword")
.config("hive.exec.scratchdir", "/tmp/hive/${user.name}")
.config("hive.metastore.uris", "thrift://192.168.175.42:9083")
.enableHiveSupport()
.getOrCreate()
I tried with SparkSession.clearActiveSession() and SparkSession.clearDefaultSession(), but it doesn't work and throws the error below:
Hive: Failed to access metastore. This class should not accessed in runtime.
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Is there any other way to access two Hive tables using multiple SparkSessions or SparkContexts?
Thanks
Spark applications can use multiple sessions to work with different underlying data catalogs. You can create a new session from an existing one by calling its newSession method.
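For example, a minimal sketch reusing the spark session from the question (the table name is just a placeholder):

val spark2 = spark.newSession()  // shares the same SparkContext as spark, but has its own SQL conf, temp views and UDF registry
val df = spark2.sql("select * from <SourceTableOfMetastore_1>")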
Note: we can have multiple Spark contexts by setting spark.driver.allowMultipleContexts to true. But having multiple Spark contexts in the same JVM is not encouraged and is not considered good practice, as it makes the application less stable and a crash in one Spark context can affect the other.
Normally you can only have one active SparkContext at a time.
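For illustration only, a hedged sketch of what that setting looks like (given the caveats above this is not recommended; the app name is made up):

import org.apache.spark.{SparkConf, SparkContext}

// Discouraged: allowing a second SparkContext in the same JVM
val conf2 = new SparkConf()
  .setMaster("local[*]")
  .setAppName("second-context")
  .set("spark.driver.allowMultipleContexts", "true")
val sc2 = new SparkContext(conf2)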
I use this approach and it works perfectly fine with Spark 2.1:
val sc = SparkSession.builder()
.config("hive.metastore.uris", "thrift://dbsyz1111:10000")
.enableHiveSupport()
.getOrCreate()
// Create dataframe 1 by reading data from the Hive table in metastore 1
val dataframe_1 = sc.sql("select * from <SourceTableOfMetastore_1>")
// Clear the existing Spark sessions
SparkSession.clearActiveSession()
SparkSession.clearDefaultSession()
// Initialize Spark session 2 with Hive metastore 2
val spc2 = SparkSession.builder()
.config("hive.metastore.uris", "thrift://dbsyz2222:10004")
.enableHiveSupport()
.getOrCreate()
// Load dataframe 1 from session 1 into a new dataframe in session 2, carrying its schema and data over via the RDD API
val dataframe_2 = spc2.createDataFrame(dataframe_1.rdd, dataframe_1.schema)
dataframe_2.write.mode("Append").saveAsTable("<TargetTableNameOfMetastore_2>")