Airflow connection to MySQL

Tags: mysql, airflow

I would like to run some ad hoc queries against my MySQL database, which is currently hosted on AWS RDS. I created a connection with all of the necessary credentials in the Airflow UI; however, the database did not show up under the Data Profiling > Ad Hoc Query section.

Any help is appreciated. Thanks!

Sau Rieng asked Feb 12 '18 14:02





1 Answer

For the original question, the OP may simply need to install a Python MySQL adapter, for example mysqlclient, which provides the MySQLdb module.

I just ran into a similar issue.

For me, it was caused by a missing dependency on my system.

As I was trying to connect to a Postgres database, I installed the Python PostgreSQL adapter, psycopg2:

pip install psycopg2

I restarted the Airflow web server, and Postgres connections began to populate within the Ad Hoc Query drop-down.

Here's how I identified this issue.

I was having this same problem trying to get a connection to an RDS Postgres server to appear in the Ad Hoc Query drop-down. Even after I duplicated the existing sqlite_default connection, the drop-down appeared to include only SQLite connections. This was with a near-vanilla default Airflow configuration. It seemed the connection was not listed because db.get_hook() returned None.

Stepping deeper into the code, I found that the import from airflow.hooks.postgres_hook import PostgresHook was failing with an error like:

*** ImportError: No module named 'psycopg2'

Using an interactive Python debugger (e.g. pdb, via import pdb; pdb.set_trace()), the OP may find a similar error message, i.e.:

(Pdb) from airflow.hooks.mysql_hook import MySqlHook
*** ImportError: No module named 'MySQLdb'
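
If you would rather not step through with a debugger, a rough check like the one below may surface the same problem. It is only a sketch: the helper name missing_drivers and the DRIVERS mapping are my own illustration, not part of Airflow, and the module names are taken from the ImportError messages above. Run it with the same Python environment that the Airflow web server uses.

```python
import importlib.util

# Illustrative mapping (an assumption, not an Airflow API): connection type
# to the driver module that the corresponding hook tries to import.
DRIVERS = {
    "mysql": "MySQLdb",      # provided by the mysqlclient package
    "postgres": "psycopg2",
    "sqlite": "sqlite3",     # part of the standard library
}


def missing_drivers(drivers=DRIVERS):
    """Return {connection type: module name} for modules that cannot be found."""
    missing = {}
    for conn_type, module_name in drivers.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module_name) is None:
            missing[conn_type] = module_name
    return missing


if __name__ == "__main__":
    for conn_type, module_name in missing_drivers().items():
        print(f"{conn_type}: module {module_name!r} is not installed")
```

Any connection type it reports is likely to be absent from the drop-down until the adapter is installed and the web server is restarted.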

RUN-CMD answered Sep 18 '22 03:09
