Not able to fetch result from hive transaction enabled table through spark-sql

Background:

  • I am using HDP with Spark 1.6.0 and Hive 1.2.1

Steps followed:

Create a Hive table:

hive>
CREATE TABLE orctest (
  PROD_ID bigint,
  CUST_ID bigint,
  TIME_ID timestamp,
  CHANNEL_ID bigint,
  PROMO_ID bigint,
  QUANTITY_SOLD decimal(10,0),
  AMOUNT_SOLD decimal(10,0)
)
CLUSTERED BY (PROD_ID) INTO 32 BUCKETS
STORED AS ORC
TBLPROPERTIES ("orc.compress"="SNAPPY", "transactional"="true");

Insert a record into orctest:

hive>
insert into orctest values(1, 1, '2016-08-02 21:36:54.000000000', 1, 1, 10, 10000);

Try to access the orctest table from spark-shell:

scala>
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

val s = hiveContext.table("orctest")

Exception thrown:

16/08/02 22:06:54 INFO OrcRelation: Listing hdfs://hadoop03:8020/apps/hive/warehouse/orctest on driver
16/08/02 22:06:54 INFO OrcRelation: Listing hdfs://hadoop03:8020/apps/hive/warehouse/orctest/delta_0000005_0000005 on driver
java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:165)
at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:39)
at org.apache.spark.sql.execution.datasources.LogicalRelation$$anonfun$1.apply(LogicalRelation.scala:38)
at scala.Option.map(Option.scala:145)
at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$$convertToOrcRelation(HiveMetastoreCatalog.scala:588)

Any help will be really appreciated.

asked Aug 03 '16 by Priyanka.Patil


People also ask

Can Spark SQL be executed in Hive?

Spark SQL supports queries written in HiveQL, a SQL-like language; these queries are converted into Spark jobs. The Spark DataFrame API encapsulates data sources, including DataStax Enterprise data, organized into named columns.
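For example, in spark-shell against the orctest table from the question (the query itself is only an illustration):

scala>
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
// Run a HiveQL query; Spark compiles it into Spark jobs and returns a DataFrame
val df = hiveContext.sql("SELECT PROD_ID, AMOUNT_SOLD FROM orctest WHERE QUANTITY_SOLD > 5")
df.show()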

How do I enable Hive support in Spark?

To connect to the Hive metastore, you need to copy the hive-site.xml file into the spark/conf directory. After that, Spark will be able to connect to the Hive metastore.
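As a rough check (assuming hive-site.xml is already in place), a HiveContext created in spark-shell should then see the metastore tables:

scala>
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
// Lists the tables registered in the Hive metastore as a DataFrame
hiveContext.tables().show()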

Does Hive work with Spark?

Spark SQL is designed to be compatible with the Hive Metastore, SerDes and UDFs. Currently, Hive SerDes and UDFs are based on Hive 1.2.1, and Spark SQL can be connected to different versions of the Hive Metastore (from 0.12.0 to 2.3.x).
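The metastore version can be pinned through standard Spark configuration. A minimal sketch for a standalone Spark 1.6-era application (the app name and property values here are illustrative):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val conf = new SparkConf()
  .setAppName("metastore-version-demo")
  .set("spark.sql.hive.metastore.version", "1.2.1")  // metastore version to connect to
  .set("spark.sql.hive.metastore.jars", "builtin")   // use Spark's bundled Hive 1.2.1 jars
val sc = new SparkContext(conf)
val hiveContext = new HiveContext(sc)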


1 Answer

Try setting: hiveContext.setConf("spark.sql.hive.convertMetastoreOrc", "false")
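The likely cause: with convertMetastoreOrc left at its default (true), Spark 1.6 converts metastore ORC tables to its native ORC data source, which does not understand the delta_* directories that an ACID (transactional) table writes, hence the assertion failure; setting it to false makes Spark read the table through Hive's ORC SerDe instead. A sketch of the full sequence in spark-shell:

scala>
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
// Keep Spark from converting the metastore ORC table to its native ORC reader
hiveContext.setConf("spark.sql.hive.convertMetastoreOrc", "false")
val s = hiveContext.table("orctest")
s.show()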

answered Nov 15 '22 by sandyyyy