I'm new to Hadoop and Impala. I managed to connect to Impala by installing impyla and running the following code, which authenticates via LDAP:
from impala.dbapi import connect
from impala.util import as_pandas
conn = connect(host='server.lrd.com', port=21050, database='tcad',
               auth_mechanism='PLAIN', user='alexcj', password='secret1pass',
               use_ssl=True, timeout=20)
I'm then able to grab a cursor and execute queries as:
cursor = conn.cursor()
cursor.execute('SELECT * FROM tab_2014_m LIMIT 10')
df = as_pandas(cursor)
I'd like to use SQLAlchemy to connect to Impala so I can take advantage of some of its nicer features. I found a test file in the impyla source code that shows how to create a SQLAlchemy engine with the Impala driver:
engine = create_engine('impala://localhost')
I'd like to do the same, but my call to connect above takes many more parameters, and I don't know how to pass them to SQLAlchemy's create_engine to get a successful connection. Has anyone done this? Thanks.
As explained at https://github.com/cloudera/impyla/issues/214, you can pass create_engine a creator function that returns the impyla connection:
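import sqlalchemy
from impala.dbapi import connect

# Placeholder credentials; replace with your own.
user = 'some_user'
pwd = 'some_password'

def conn():
    # SQLAlchemy calls this with no arguments whenever it needs a new connection.
    return connect(host='some_host',
                   port=21050,
                   database='default',
                   timeout=20,
                   use_ssl=True,
                   ca_cert='some_pem',
                   user=user, password=pwd,
                   auth_mechanism='PLAIN')

engine = sqlalchemy.create_engine('impala://', creator=conn)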
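Since creator has to be a callable that takes no arguments, one way to avoid hardcoding the parameters inside the function is to bind them with functools.partial. The following is only a sketch: make_creator is a hypothetical helper name (not part of impyla or SQLAlchemy), and sqlite3.connect stands in for impala.dbapi.connect so the snippet runs without a cluster.

```python
import sqlite3
from functools import partial


def make_creator(connect_fn, **connect_kwargs):
    """Return a zero-argument callable that opens a new DBAPI connection.

    Hypothetical helper: connect_fn is any DBAPI connect function and
    connect_kwargs are the keyword arguments it should always receive.
    """
    return partial(connect_fn, **connect_kwargs)


# Stand-in for impala.dbapi.connect so the sketch runs anywhere.
creator = make_creator(sqlite3.connect, database=":memory:")

# Calling the creator with no arguments yields a fresh connection,
# which is how a connection pool would use it.
dbapi_conn = creator()
row = dbapi_conn.execute("SELECT 1").fetchone()
print(row)
dbapi_conn.close()
```

With impyla you would presumably call make_creator(connect, host=..., port=21050, auth_mechanism='PLAIN', ...) using your own values and pass the result as creator= to sqlalchemy.create_engine('impala://', ...), exactly as conn is passed above.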