My question is rather simple, but somehow I cannot find a clear answer by reading the documentation.
I have Spark2 running on a CDH 5.10 cluster. There is also Hive and a metastore.
I create a session in my Spark program as follows:
SparkSession spark = SparkSession.builder().appName("MyApp").enableHiveSupport().getOrCreate();
Suppose I have the following HiveQL query:
spark.sql("SELECT someColumn FROM someTable")
I would like to know whether such a query is executed by Spark or by Hive. I am doing some performance evaluation, and I need to know whether the execution times of queries run with spark.sql([HiveQL query]) should be attributed to Spark or to Hive.
SparkSession was introduced in Spark 2.0. It is the entry point to the underlying Spark functionality and lets you programmatically create Spark RDDs, DataFrames, and Datasets.
You should always close your SparkSession when you are done with it (if only as a matter of good practice, giving back what you've been given). Closing a SparkSession may release cluster resources that can then be given to another application.
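A minimal sketch of this lifecycle in Java, reusing the question's setup (the query is the illustrative one from above): SparkSession implements Closeable, so a try-with-resources block closes the session automatically when you leave it.

import org.apache.spark.sql.SparkSession;

public class MyApp {
    public static void main(String[] args) {
        // try-with-resources: close() is called automatically on exit,
        // which stops the session and frees its cluster resources
        try (SparkSession spark = SparkSession.builder()
                .appName("MyApp")
                .enableHiveSupport()
                .getOrCreate()) {
            spark.sql("SELECT someColumn FROM someTable").show();
        }
    }
}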
enableHiveSupport(): enables Hive support, including connectivity to a persistent Hive metastore, support for Hive SerDes, and Hive user-defined functions.
getOrCreate(): gets an existing SparkSession or, if there is none, creates a new one based on the options set in the builder. New in version 2.0.0. This method first checks whether there is a valid global default SparkSession and, if so, returns it.
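A small illustration of that reuse behavior (a fragment, assuming the import from the sketch above and that both calls run in the same JVM):

SparkSession first = SparkSession.builder().appName("MyApp").getOrCreate();
SparkSession second = SparkSession.builder().getOrCreate();
// Within one JVM, the second call returns the already-created default
// session, so both variables refer to the same object.
System.out.println(first == second);  // expected: true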
Spark knows two catalog implementations, hive and in-memory. If you call enableHiveSupport(), then spark.sql.catalogImplementation is set to hive, otherwise to in-memory. So if you enable Hive support, spark.catalog.listTables().show() will show you all tables from the Hive metastore.
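You can verify both points from Java (a fragment, assuming the session setup above; listing tables assumes the metastore on the question's cluster is reachable):

// Prints "hive" when Hive support is enabled, "in-memory" otherwise
System.out.println(spark.conf().get("spark.sql.catalogImplementation"));

// Lists the tables known to the Hive metastore
spark.catalog().listTables().show();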
But this does not mean Hive is used for the query*; it just means that Spark communicates with the Hive metastore. The execution engine is always Spark.
*There are actually some functions, like percentile and percentile_approx, which are native Hive UDAFs.
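For example, a query like the following goes through one of those Hive UDAFs while still being executed by Spark (a sketch reusing the question's table and column names; assumes Hive support is enabled on the session):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// percentile_approx is resolved as a Hive UDAF when Hive support is enabled
Dataset<Row> medians = spark.sql(
        "SELECT percentile_approx(someColumn, 0.5) FROM someTable");
medians.show();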