I'm running into an issue when trying to create a table.
Here is the code to create the table, where the exception is occurring:
sparkSession.sql(
  "CREATE TABLE IF NOT EXISTS mydatabase.students (" +
    "name string, " +
    "age int)")
Here is the spark session configuration:
lazy val sparkSession = SparkSession
.builder()
.appName("student_mapping")
.enableHiveSupport()
.getOrCreate()
And this is the exception:
org.apache.spark.sql.AnalysisException: Hive support is required to CREATE Hive TABLE (AS SELECT);;
'CreateTable `mydatabase`.`students`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Ignore
My question is: why is this exception occurring? I have several other Spark programs that run flawlessly with the same session configuration. I'm using Scala 2.11 and Spark 2.3.
Enables Hive support, including connectivity to a persistent Hive metastore, support for Hive SerDes, and Hive user-defined functions. New in version 2.0.
Spark SQL also supports reading and writing data stored in Apache Hive. However, since Hive has a large number of dependencies, these dependencies are not included in the default Spark distribution. If Hive dependencies can be found on the classpath, Spark will load them automatically.
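In practice that means the spark-hive module has to be on the classpath when the job runs. If you build the job with sbt, a minimal sketch of the dependency section might look like this (the version numbers are assumptions matching the Spark 2.3 / Scala 2.11 setup described above):

// build.sbt -- spark-hive pulls in the Hive dependencies that
// enableHiveSupport() needs to find on the classpath.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.3.0" % "provided",
  "org.apache.spark" %% "spark-hive" % "2.3.0" % "provided"
)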
In a managed table, both the table data and the table schema are managed by Hive. The data will be located in a folder named after the table within the Hive data warehouse, which is essentially just a file location in HDFS. The location is user-configurable when Hive is installed.
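As a quick sanity check, you can print the warehouse location your session actually resolved; a minimal sketch, assuming a running session named sparkSession:

// Prints e.g. hdfs://namenode/sql/metadata/hive, or a local
// spark-warehouse directory if spark.sql.warehouse.dir was never set.
println(sparkSession.conf.get("spark.sql.warehouse.dir"))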
SparkSession is the entry point to Spark SQL. It is one of the very first objects you create while developing a Spark SQL application.
SessionState is the state separation layer between Spark SQL sessions, including SQL configuration, tables, functions, UDFs, SQL parser, and everything else that depends on a SQLConf.
SessionState is available as the sessionState property of a SparkSession.
Internally, sessionState clones the optional parent SessionState (if one was given when creating the SparkSession) or creates a new SessionState using a BaseSessionStateBuilder, as selected by the spark.sql.catalogImplementation configuration property:
in-memory (default) for org.apache.spark.sql.internal.SessionStateBuilder
hive for org.apache.spark.sql.hive.HiveSessionStateBuilder
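To see which of the two your running session actually picked, you can read the property back (a sketch; the key is readable at runtime but cannot be changed after the session is created):

// Returns "hive" when Hive support is enabled, "in-memory" otherwise.
println(sparkSession.conf.get("spark.sql.catalogImplementation"))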
To use Hive, Spark needs org.apache.spark.sql.hive.HiveSessionStateBuilder, and according to the documentation this is selected by setting the spark.sql.catalogImplementation property to hive when creating the SparkSession object:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val conf = new SparkConf()
  .set("spark.sql.warehouse.dir", "hdfs://namenode/sql/metadata/hive")
  .set("spark.sql.catalogImplementation", "hive")
  .setMaster("local[*]")
  .setAppName("Hive Example")

val spark = SparkSession.builder()
  .config(conf)
  .enableHiveSupport()
  .getOrCreate()
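Note that enableHiveSupport() is essentially shorthand for setting spark.sql.catalogImplementation to hive (it also fails fast at startup if the Hive classes are missing from the classpath), so combining it with the explicit .set above is redundant but harmless.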
Alternatively, you can pass the property on the command line with --conf spark.sql.catalogImplementation=hive when you submit your job to the cluster.
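For example (a sketch; the class name and jar are placeholders for your own job):

spark-submit \
  --class com.example.StudentMapping \
  --conf spark.sql.catalogImplementation=hive \
  student-mapping.jar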