
How is Spark Thrift server related to Apache Thrift?

I read a post on Quora which says that Spark Thrift server is related to Apache Thrift, which is a binary communication protocol. Spark Thrift server is the interface to Hive, but how does Spark Thrift server use Apache Thrift to communicate with Hive via a binary protocol/RPC?

pacman asked Aug 14 '17 05:08



1 Answer

Spark Thrift Server is a Hive-compatible interface for Spark.

That means it provides an implementation of HiveServer2 that you can connect to with beeline; however, almost all of the computation is performed by Spark, not Hive.
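As a rough illustration of the point above, a client addresses Spark Thrift Server exactly as it would HiveServer2, through a `jdbc:hive2://` URL. The sketch below only builds the URL and the beeline argument list; the helper names `hive2_jdbc_url` and `beeline_command` are hypothetical, and the host, port 10000 and database `default` are assumed defaults that may differ in your deployment.

```python
import shlex


def hive2_jdbc_url(host="localhost", port=10000, database="default"):
    """Build a HiveServer2-style JDBC URL (assumed defaults, adjust as needed)."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(url, sql, user=None):
    """Return the argument list for a one-off beeline query.

    -u is the JDBC URL, -n the user name, -e the query to execute.
    """
    args = ["beeline", "-u", url]
    if user:
        args += ["-n", user]
    args += ["-e", sql]
    return args


url = hive2_jdbc_url()
cmd = beeline_command(url, "SHOW TABLES")
# Print a shell-safe command line; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

The same command line works whether the endpoint is HiveServer2 or Spark Thrift Server, which is what "Hive-compatible interface" means in practice.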

In earlier versions, the query parser came from Hive. Spark Thrift Server now uses Spark's own query parser.

Apache Thrift is a framework for developing RPC (Remote Procedure Call) services, so there are many implementations built on it. Cassandra also used Thrift in the past; it has since been replaced by the Cassandra native protocol.

So, Apache Thrift is a framework for developing RPCs, and Spark Thrift Server is an implementation of the Hive protocol that uses Spark as its computation framework.
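To make the "binary protocol" part concrete, here is a minimal sketch of how Thrift's binary protocol (strict mode) frames the start of an RPC call: a 32-bit word combining a version marker and the message type, then the length-prefixed method name, then a sequence id. `OpenSession` is used as the example method name because HiveServer2's Thrift service defines it; the function names are hypothetical, the call arguments that follow the header are omitted, and a real deployment usually adds a transport layer (for example SASL) around these bytes.

```python
import struct

VERSION_1 = 0x80010000  # strict-mode version marker in the binary protocol
CALL = 1                # message type for a request


def encode_message_begin(method, seqid, message_type=CALL):
    """Encode only the message header of a Thrift binary-protocol call."""
    name = method.encode("utf-8")
    header = struct.pack("!I", VERSION_1 | message_type)
    return header + struct.pack("!i", len(name)) + name + struct.pack("!i", seqid)


def decode_message_begin(data):
    """Decode the header produced above into (method, message_type, seqid)."""
    (word,) = struct.unpack_from("!I", data, 0)
    if word & 0xFFFF0000 != VERSION_1:
        raise ValueError("not a strict binary-protocol message")
    message_type = word & 0xFF
    (length,) = struct.unpack_from("!i", data, 4)
    name = data[8:8 + length].decode("utf-8")
    (seqid,) = struct.unpack_from("!i", data, 8 + length)
    return name, message_type, seqid


frame = encode_message_begin("OpenSession", seqid=0)
print(frame.hex())
print(decode_message_begin(frame))
```

In practice nobody writes this by hand: the Thrift compiler generates client and server stubs from the service definition, and both HiveServer2 and Spark Thrift Server expose the same generated service, which is why the same clients work against either.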

For more details, please see this link from @RussS

T. Gawęda answered Sep 30 '22 15:09