
Connect Hive through Java JDBC

Tags: hadoop, hive

There is an existing question here, "connect from java to Hive", but mine is different.

My Hive instance is running on machine1, and I need to send it some queries from a Java server running on machine2. As I understand it, Hive has a JDBC interface for receiving remote queries. I took the code from here: HiveServer2 Clients
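For reference, a HiveServer2 JDBC client has roughly the shape below. This is a minimal sketch, not the exact code from the article: the class name `HiveJdbcSketch`, the helper `hiveUrl`, the port `10000`, the database `default`, the user `hive`, and the `SHOW TABLES` query are all assumptions for illustration. Only the driver class name `org.apache.hive.jdbc.HiveDriver` and the host `machine1` come from this question.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJdbcSketch {
    // Builds a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.
    static String hiveUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Load the driver explicitly; older hive-jdbc jars may not self-register.
        try {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
        } catch (ClassNotFoundException e) {
            System.err.println("Hive JDBC driver not on classpath: " + e.getMessage());
            return;
        }

        // Assumed values: port 10000, database "default", user "hive", empty password.
        String url = hiveUrl("machine1", 10000, "default");
        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1)); // first column of each result row
            }
        } catch (SQLException e) {
            System.err.println("Could not query Hive: " + e.getMessage());
        }
    }
}
```

The call to `DriverManager.getConnection` is the step where the driver opens its transport to the server, so any class the driver needs for that must already be on the client's classpath.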

I installed the dependencies listed in the article:

  1. hive-jdbc*.jar
  2. hive-service*.jar
  3. libfb303-0.9.0.jar
  4. libthrift-0.9.0.jar
  5. log4j-1.2.16.jar
  6. slf4j-api-1.6.1.jar
  7. slf4j-log4j12-1.6.1.jar
  8. commons-logging-1.0.4.jar

However, I got a java.lang.NoClassDefFoundError at compile time. Full error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
    at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:393)
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:187)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:215)
    at com.bidstalk.tools.RawLogsQuerySystem.HiveJdbcClient.main(HiveJdbcClient.java:25)

Another StackOverflow question, "Hive Error", recommended adding the Hadoop API dependencies in Maven.

I don't understand why I need the Hadoop API for a client to connect to Hive. Shouldn't the JDBC driver be agnostic of the underlying query system? I just need to pass some SQL queries.
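One way to see which classes are actually visible on machine2 is to probe the classpath directly. The sketch below is only a diagnostic aid, assuming it is run with the same classpath as the client; the class name `ClasspathProbe` and the helper `isOnClasspath` are hypothetical. The two class names it checks are taken from the stack trace above; `org.apache.hadoop.conf.Configuration` is part of Hadoop's common library.

```java
public class ClasspathProbe {
    // Returns true if the named class can be located by this class's loader.
    // The class is loaded but not initialized, so no static initializers run.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Both names come from the stack trace in the question.
        String[] names = {
            "org.apache.hive.jdbc.HiveDriver",
            "org.apache.hadoop.conf.Configuration"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If the driver is reported as found and `Configuration` as missing, that matches the trace: the driver class itself loads, but it references a Hadoop class when it opens its transport.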

Edit: I am using Cloudera (5.3.1), so I think I need to add the CDH dependencies. The Cloudera instance is running Hadoop 2.5.0 and HiveServer2.

But the servers are on machine1. On machine2 the code should at least compile, and I should only have issues at runtime!

Mangat Rai Modi asked Feb 27 '15 08:02


1 Answer

It seems like you are all working with Cloudera. I found that the repo in Maven looks old, because if you go to their site you can download their JDBC driver: https://www.cloudera.com/downloads/connectors/hive/jdbc/2-5-20.html The driver seems to support more functionality than the one in Hive; I noticed that they have addBatch implemented. I just wish they had these libraries in Maven. Maybe someone can find where to get them using Maven.

pitchblack408 answered Sep 18 '22 13:09