Difference between the Hive Thrift server from the Hive and Spark distributions

Question

What is the difference between starting a Hive server with each of the following two commands?

hive --service hiveserver2 (from the Hive distribution)
./start-thriftserver.sh, run from spark/sbin (the Hive Thrift server shipped with Spark)

Do they listen on separate ports?

Which one should I use to establish a JDBC connection with the Apache Hive JDBC driver from my Java class?

jonathanChap · Accepted Answer

HiveServer2 is Hive's SQL engine. It can use MapReduce, Spark, or Tez as its execution engine: Hive builds and optimises the execution plan, then hands it to the chosen engine to run the query.
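To illustrate the engine choice, here is a minimal sketch of switching HiveServer2's engine for one JDBC session through the `hive.execution.engine` property. The class and method names are hypothetical, and whether `tez` or `spark` actually works depends on what is installed alongside Hive. This applies to HiveServer2 only.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hive: selects the execution engine for a session.
final class HiveEngineSketch {

    // Values accepted by hive.execution.engine; availability depends on the installation.
    static final List<String> ENGINES = Arrays.asList("mr", "tez", "spark");

    // Builds the SET statement, rejecting unknown engine names.
    static String setEngineStatement(String engine) {
        if (!ENGINES.contains(engine)) {
            throw new IllegalArgumentException("Unknown engine: " + engine);
        }
        return "SET hive.execution.engine=" + engine;
    }

    // Sets the engine on an open HiveServer2 connection, then runs the given SQL.
    static void runWithEngine(Connection conn, String engine, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(setEngineStatement(engine));
            stmt.execute(sql);
        }
    }

    public static void main(String[] args) {
        // Prints the statement that would be sent for the Tez engine.
        System.out.println(setEngineStatement("tez"));
    }
}
```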

I am a heavy Spark user, but I wanted Hive available to run ad hoc queries through Hue. After some research, I can see that Hive 1.2.1 supports up to Spark 1.4.1 as the execution engine. Hive 2 has a dependency on Spark 1.5, but I have not tried to run it with 1.5 or 1.6.

The Spark Thrift server can replace HiveServer2. It uses Spark to run the query and builds its own execution plan (which may or may not be better than Hive's), and it also gives you access to other Spark sources such as RDDs and text files. You can also run the Thrift server with the latest version of Spark.
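On the JDBC part of the question: both servers speak the HiveServer2 Thrift protocol, so the same Hive JDBC driver and `jdbc:hive2://` URL should work against either one. Both also default to port 10000 unless `hive.server2.thrift.port` is changed, so they cannot share a host on the default settings. The following is a minimal sketch, not a definitive client. It assumes the Hive JDBC driver jar is on the classpath, an unsecured server on `localhost:10000`, and an empty password. The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical client sketch; works the same way against HiveServer2 or the Spark Thrift server.
final class ThriftServerJdbcSketch {

    // Builds a HiveServer2-style JDBC URL.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs SHOW DATABASES and returns the first column of every row.
    static List<String> listDatabases(String url, String user) throws SQLException {
        List<String> names = new ArrayList<>();
        // Assumption: no authentication is configured, so the password is empty.
        try (Connection conn = DriverManager.getConnection(url, user, "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW DATABASES")) {
            while (rs.next()) {
                names.add(rs.getString(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed host and port; change the port if the server was started on another one.
        String url = jdbcUrl("localhost", 10000, "default");
        try {
            for (String name : listDatabases(url, System.getProperty("user.name"))) {
                System.out.println(name);
            }
        } catch (SQLException e) {
            // Reached when no server is listening or the driver jar is missing.
            System.err.println("Could not query " + url + ": " + e.getMessage());
        }
    }
}
```

Since the client code is identical, the choice of server comes down to which engine and planner you want behind the connection.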

vikas · Answer

I guess both do the same thing, except that when you start the Hive Thrift server from Spark, it adds one more CLI service to the Thrift server, which should add the Spark SQL context to the Thrift API.

Tags: java, jdbc, thrift, hadoop, hive

BludShot
