We are trying to move from Pentaho Kettle to Apache Airflow for ETL and to centralize all data processes under one tool.
We use Kettle to read data daily from Postgres/MySQL databases and move the data to S3 -> Redshift.
What is the easiest way to do this? I do not see an operator that can do this directly, so should I use the MySQL/Postgres operators to put the data in a local file, and then use the S3 operator to move the data to S3?
Thank you
Configuring the connection: specify the schema name to be used in the database, the user name and password to connect, and any extra parameters (as a JSON dictionary) that can be used in the Postgres connection.
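For example, one way to register such a connection without the UI is an environment variable in Airflow's AIRFLOW_CONN_<CONN_ID> URI format; a minimal sketch, where the connection id, host, credentials, schema and sslmode extra are all illustrative assumptions:

```python
import os

# Airflow resolves AIRFLOW_CONN_POSTGRES_SOURCE to a connection with id "postgres_source".
# URI layout: <conn-type>://<login>:<password>@<host>:<port>/<schema>?<extra params>
os.environ["AIRFLOW_CONN_POSTGRES_SOURCE"] = (
    "postgres://etl_user:etl_password@source-db.example.com:5432/sales?sslmode=require"
)
```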
Airflow was built to interact with its metadata database using SQLAlchemy, with MySQL, Postgres and SQLite as supported backends (SQLite is used primarily for development purposes).
Airflow is essentially a graph (a Directed Acyclic Graph) made up of tasks (nodes) and dependencies (edges). A task, defined or implemented by an operator, is a unit of work in your data pipeline. The purpose of the PostgresOperator is to define tasks involving interactions with a PostgreSQL database.
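A minimal sketch of such a task, assuming an Airflow 1.10-style import path and an illustrative connection id and SQL statement:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.postgres_operator import PostgresOperator

with DAG(
    dag_id="postgres_example",
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Runs the given SQL against the database behind the "postgres_source" connection.
    create_staging = PostgresOperator(
        task_id="create_staging_table",
        postgres_conn_id="postgres_source",
        sql="CREATE TABLE IF NOT EXISTS staging_orders (LIKE orders);",
    )
```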
You can build your own 'mysql_to_s3' operator and add it as a plugin to Airflow.
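A rough sketch of what such an operator could look like, pulling rows with MySqlHook and writing a CSV to S3 with S3Hook (the class name, parameters and CSV layout are all assumptions, and the import paths follow Airflow 1.10):

```python
import csv
import io

from airflow.hooks.mysql_hook import MySqlHook
from airflow.hooks.S3_hook import S3Hook
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults


class MySqlToS3Operator(BaseOperator):
    """Run a query against MySQL and upload the result set to S3 as CSV."""

    @apply_defaults
    def __init__(self, sql, s3_bucket, s3_key,
                 mysql_conn_id="mysql_default", aws_conn_id="aws_default",
                 *args, **kwargs):
        super(MySqlToS3Operator, self).__init__(*args, **kwargs)
        self.sql = sql
        self.s3_bucket = s3_bucket
        self.s3_key = s3_key
        self.mysql_conn_id = mysql_conn_id
        self.aws_conn_id = aws_conn_id

    def execute(self, context):
        # Fetch all rows returned by the query.
        mysql = MySqlHook(mysql_conn_id=self.mysql_conn_id)
        rows = mysql.get_records(self.sql)

        # Serialize the result set to CSV in memory.
        buffer = io.StringIO()
        csv.writer(buffer).writerows(rows)

        # Upload the CSV to S3, replacing any file from a previous run.
        s3 = S3Hook(aws_conn_id=self.aws_conn_id)
        s3.load_string(buffer.getvalue(), key=self.s3_key,
                       bucket_name=self.s3_bucket, replace=True)
```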
There is an operator that archives data from MySQL to GCS: mysql_to_gcs.py.
You can keep almost all of that code; with a small change to def _upload_to_gcs you can use the S3 hook instead: s3_hook.py.
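A sketch of the kind of change that answer describes, assuming the 1.10-era operator where the files to upload reach the method as a mapping of object names to temporary file handles (the bucket attribute and connection id are assumptions):

```python
from airflow.hooks.S3_hook import S3Hook


def _upload_to_s3(self, files_to_upload):
    """Drop-in replacement for _upload_to_gcs: ship the exported files to S3."""
    s3_hook = S3Hook(aws_conn_id=self.aws_conn_id)
    for object_name, tmp_file_handle in files_to_upload.items():
        s3_hook.load_file(
            filename=tmp_file_handle.name,
            key=object_name,
            bucket_name=self.bucket,
            replace=True,
        )
```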
Documentation about custom plugins:
Airflow plugins: Blog article
Airflow plugins: Official documentation
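To complement those links, a minimal sketch of how a file in the plugins/ folder could expose the custom operator (the plugin name and the operator's module path are assumptions):

```python
from airflow.plugins_manager import AirflowPlugin

# Hypothetical module holding the custom operator sketched above.
from mysql_to_s3_operator import MySqlToS3Operator


class MySqlToS3Plugin(AirflowPlugin):
    name = "mysql_to_s3_plugin"
    operators = [MySqlToS3Operator]
```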
airflow-plugins (by Astronomer) has a MySqlToS3Operator that takes the result set of a MySQL query and places it on S3 as either CSV or JSON.
The plugin can be found here: https://github.com/airflow-plugins/mysql_plugin/blob/master/operators/mysql_to_s3_operator.py
From there you might be able to use the s3_to_redshift operator to load the data from S3 into Redshift: https://airflow.readthedocs.io/en/latest/_modules/airflow/operators/s3_to_redshift_operator.html
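A hedged sketch of that last hop, using the built-in S3ToRedshiftTransfer operator from Airflow 1.10 to issue a Redshift COPY from files an upstream task has already written to S3 (the schema, table, bucket, key and connection ids are illustrative assumptions):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.s3_to_redshift_operator import S3ToRedshiftTransfer

with DAG(
    dag_id="s3_to_redshift_example",
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # COPY the exported file(s) under s3://my-etl-bucket/exports into Redshift.
    load_orders = S3ToRedshiftTransfer(
        task_id="load_orders",
        schema="public",
        table="orders",
        s3_bucket="my-etl-bucket",
        s3_key="exports",
        redshift_conn_id="redshift_default",
        aws_conn_id="aws_default",
        copy_options=["CSV"],
    )
```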