impyla hangs when connecting to HiveServer2

Tags: python, hive

I'm writing some ETL flows in Python that, for part of the process, use Hive. Cloudera's impyla client, according to the documentation, works with both Impala and Hive.

In my experience, the client worked for Impala, but hung when I tried to connect to Hive:

from impala.dbapi import connect

conn = connect(host='host_running_hs2_service', port=10000, user='awoolford', password='Bzzzzz')
cursor = conn.cursor()          # <- hangs here
cursor.execute('show tables')
results = cursor.fetchall()
print(results)

If I step into the code, it hangs when it tries to open a session (line #873 of hiveserver2.py).

At first, I suspected that a firewall port might be blocking the connection, and so I tried to connect using Java. To my surprise, this worked:

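import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {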
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
        Connection connection = DriverManager.getConnection("jdbc:hive2://host_running_hs2_service:10000/default", "awoolford", "Bzzzzz");
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery("SHOW TABLES");

        while (resultSet.next()) {
            System.out.println(resultSet.getString(1));
        }
    }
}
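For what it's worth, the firewall theory can also be checked from Python without a JDBC driver. The following is only a minimal sketch: `port_is_reachable` is a hypothetical helper name, and the check uses nothing but the standard `socket` module. A successful TCP connection shows that the port is open, but it says nothing about whether the Thrift/SASL handshake that follows will succeed.

```python
import socket


def port_is_reachable(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
    sock.close()
    return True


# Example call (host name taken from the question):
#   port_is_reachable('host_running_hs2_service', 10000)
```

If this returns True while `conn.cursor()` still hangs, the network path is fine and the problem lies in the session handshake.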

Since Hive and Python are such commonly used technologies, I'm curious to know if anyone else has experienced this problem and, if so, what did you do to fix it?

Versions:

  • Hive 1.1.0-cdh5.5.1
  • Python 2.7.11 | Anaconda 2.3.0
  • Redhat 6.7
Alex Woolford asked Mar 07 '16 21:03


1 Answer

Start HiveServer2 with SASL disabled:

/path/to/bin/hive --service hiveserver2 --hiveconf hive.server2.authentication=NOSASL

Then pass the matching auth_mechanism when connecting from impyla:

from impala.dbapi import connect

conn = connect(host='host_running_hs2_service', port=10000, user='awoolford', password='Bzzzzz', auth_mechanism='NOSASL')
cursor = conn.cursor()
cursor.execute('show tables')
results = cursor.fetchall()
print(results)
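If restarting HiveServer2 with different settings is not an option, the client can probably be matched to the server instead. As far as I know, HiveServer2's default (`hive.server2.authentication=NONE`) still expects a SASL PLAIN transport, while impyla's `connect()` defaults to `auth_mechanism='NOSASL'`, and that mismatch would explain the hang. The sketch below is an assumption-laden illustration, not part of impyla: `HS2_AUTH_TO_IMPYLA`, `impyla_auth_mechanism` and `connect_to_hs2` are hypothetical names, and the mapping should be checked against the impyla version in use.

```python
# Assumed mapping from the server's hive.server2.authentication value to
# impyla's auth_mechanism argument; verify against your impyla version.
HS2_AUTH_TO_IMPYLA = {
    'NOSASL': 'NOSASL',    # raw Thrift transport, no SASL
    'NONE': 'PLAIN',       # HiveServer2 default: SASL PLAIN transport
    'LDAP': 'LDAP',
    'KERBEROS': 'GSSAPI',
}


def impyla_auth_mechanism(hs2_authentication):
    """Translate a hive.server2.authentication value for impyla."""
    key = hs2_authentication.strip().upper()
    if key not in HS2_AUTH_TO_IMPYLA:
        raise ValueError('no known impyla auth_mechanism for %r'
                         % hs2_authentication)
    return HS2_AUTH_TO_IMPYLA[key]


def connect_to_hs2(host, port, user, password, hs2_authentication='NONE'):
    """Open an impyla connection that matches the server's auth setting."""
    # Imported lazily so the mapping above is usable without impyla installed.
    from impala.dbapi import connect
    return connect(host=host, port=port, user=user, password=password,
                   auth_mechanism=impyla_auth_mechanism(hs2_authentication))
```

With the server left at its default, `connect_to_hs2('host_running_hs2_service', 10000, 'awoolford', 'Bzzzzz')` would end up passing `auth_mechanism='PLAIN'`.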
戚锦秀 answered Oct 07 '22 15:10