I was using Hive JDBC, but I then learned that there is also a Hive metastore Java API (here) through which you can connect to Hive and manipulate a Hive database.
I was wondering what exactly the difference between these two approaches is.
Sorry if this is obvious, but any information would be highly appreciated.
What is Hive Metastore? The metastore is the central repository of Apache Hive metadata. It stores metadata for Hive tables (such as their schema and location) and partitions in a relational database. It provides client access to this information through the metastore service API.
The Hive ODBC client provides a set of C-compatible library functions to interact with Hive Server in a pattern similar to those dictated by the ODBC specification. See Hive ODBC Driver.
Hive JDBC Connector 2.6. The Cloudera JDBC Driver for Hive enables your enterprise users to access Hadoop data through Business Intelligence (BI) applications with JDBC support. The driver achieves this by translating calls from the application into SQL and passing the SQL queries to the underlying Hive engine.
Hive provides a JDBC connection URL string, jdbc:hive2://ip-address:port, to connect to the Hive warehouse from remote applications written in Java, Scala, Python, Spark, and many more.
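As a rough illustration of the JDBC route, here is a minimal sketch that builds a jdbc:hive2:// URL and runs a query through the standard java.sql API. The class name HiveJdbcSketch, the helper names, and the host, port, database, and user values are all assumptions made for the example; it also assumes the Hive JDBC driver is on the classpath and a HiveServer2 instance is reachable.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

class HiveJdbcSketch {

    // Builds a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs a query and returns the first column of each row as a string.
    // Needs the Hive JDBC driver on the classpath and a reachable HiveServer2.
    static List<String> firstColumn(String url, String user, String sql) throws SQLException {
        List<String> values = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(url, user, "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                values.add(rs.getString(1));
            }
        }
        return values;
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical host, port, database and user; adjust for your cluster.
        String url = buildUrl("localhost", 10000, "default");
        System.out.println("Connecting to " + url);
        for (String name : firstColumn(url, "hive", "SHOW TABLES")) {
            System.out.println(name);
        }
    }
}
```

Everything here goes through HiveServer2 as SQL text, so the client code never touches the metastore classes directly.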
As far as I understand, there are two ways to connect to Hive: through the metastore API, and through JDBC via HiveServer2.
Now, in earlier versions of Hive, HiveServer2 was not very stable, and its multi-threading support was also limited. Things have probably improved in that area, I'd imagine.
So for the JDBC API: yes, it lets you communicate with Hive using JDBC and SQL.
For metastore connectivity, there appear to be two features.
DDL -
For DDL, the metastore APIs come in handy; the org.apache.hadoop.hive.metastore.HiveMetaStoreClient class can be used for that purpose.
DML -
What I have found useful in this regard is the org.apache.hadoop.hive.ql.Driver class (https://hive.apache.org/javadocs/r0.13.1/api/ql/org/apache/hadoop/hive/ql/Driver.html).
This class has a method called run(), which lets you execute a SQL statement and get the result back.
For example, you can do the following:
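// hiveConf is an org.apache.hadoop.hive.conf.HiveConf; db and table are String names defined elsewhere
Driver driver = new Driver(hiveConf);
HiveMetaStoreClient client = new HiveMetaStoreClient(hiveConf);
SessionState.start(new CliSessionState(hiveConf));
// DML example: run a query through the embedded driver
driver.run("select * from employee");
// DDL example: drop a table through the metastore client
client.dropTable(db, table);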
The metastore in Hive, as the name indicates, is a store for the Hive database's metadata. This store is usually an RDBMS. The metastore API supports interacting with that RDBMS to tweak the metadata, not the actual Hive database or its data. For normal usage you may never want or need to use it. I would think it is meant for people building toolsets that work with the metastore, not for day-to-day usage.