Airflow: Unable to access the AWS providers

I'm trying to access the Airflow Providers, specifically the AWS providers, found here

I'm building a Docker image and installing Airflow with pip, including the AWS extra in the install command:

pip install 'apache-airflow[crypto,aws,celery,postgres,hive,jdbc,mysql,ssh]==1.10.9' \

However, I'm unable to import the provider from Python:

from airflow.providers.amazon.aws.hooks.glue import AwsGlueJobHook
>>> from airflow.providers.amazon.aws.hooks.glue import *
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'airflow.providers'

The providers folder has not been installed. Here is the listing of the installed airflow package directory:

total 184
-rw-r--r--  1 root root   833 Apr 17 15:25 version.py
-rw-r--r--  1 root root 13682 Apr 17 15:25 settings.py
-rw-r--r--  1 root root  5281 Apr 17 15:25 sentry.py
-rw-r--r--  1 root root  8942 Apr 17 15:25 plugins_manager.py
-rw-r--r--  1 root root  3833 Apr 17 15:25 logging_config.py
-rw-r--r--  1 root root  3232 Apr 17 15:25 __init__.py
-rw-r--r--  1 root root  3503 Apr 17 15:25 exceptions.py
-rw-r--r--  1 root root  2646 Apr 17 15:25 default_login.py
-rw-r--r--  1 root root 26086 Apr 17 15:25 configuration.py
-rw-r--r--  1 root root  2237 Apr 17 15:25 alembic.ini
drwxr-xr-x  6 root root  4096 Apr 17 15:25 www_rbac
drwxr-xr-x  6 root root  4096 Apr 17 15:25 www
drwxr-xr-x  5 root root  4096 Apr 17 15:25 _vendor
drwxr-xr-x  4 root root  4096 Apr 17 15:25 utils
drwxr-xr-x  4 root root  4096 Apr 17 15:25 ti_deps
drwxr-xr-x  4 root root  4096 Apr 17 15:25 task
drwxr-xr-x  3 root root  4096 Apr 17 15:25 serialization
drwxr-xr-x  3 root root  4096 Apr 17 15:25 sensors
drwxr-xr-x  3 root root  4096 Apr 17 15:25 security
drwxr-xr-x  2 root root  4096 Apr 17 15:25 __pycache__
drwxr-xr-x  3 root root  4096 Apr 17 15:25 operators
drwxr-xr-x  3 root root  4096 Apr 17 15:25 models
drwxr-xr-x  4 root root  4096 Apr 17 15:25 migrations
drwxr-xr-x  3 root root  4096 Apr 17 15:25 macros
drwxr-xr-x  4 root root  4096 Apr 17 15:25 lineage
drwxr-xr-x  3 root root  4096 Apr 17 15:25 jobs
drwxr-xr-x  3 root root  4096 Apr 17 15:25 hooks
drwxr-xr-x  3 root root  4096 Apr 17 15:25 executors
drwxr-xr-x  4 root root  4096 Apr 17 15:25 example_dags
drwxr-xr-x  3 root root  4096 Apr 17 15:25 dag
drwxr-xr-x 12 root root  4096 Apr 17 15:25 contrib
drwxr-xr-x  3 root root  4096 Apr 17 15:25 config_templates
drwxr-xr-x  3 root root  4096 Apr 17 15:25 bin
drwxr-xr-x  6 root root  4096 Apr 17 15:25 api
airflow@eaf772874a0b:/usr/local/lib/python3.7/site-packages/airflow$

Any help is greatly appreciated.

PaulSyl1980 asked Apr 17 '20 15:04



2 Answers

The providers packages are not bundled with Airflow 1.10.x, but you can install them separately with pip using the matching backport package. For AWS, use:

pip install apache-airflow-backport-providers-amazon

More info can be found here: Airflow Amazon Provider
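
To confirm whether the providers package is actually present in the image after installing, a quick check along these lines may help. This is only a minimal sketch: `module_available` is a hypothetical helper name (not part of Airflow), and it relies solely on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if the module `dotted_name` can be located.

    Hypothetical helper for illustration. Parent packages are imported
    as a side effect of the lookup; the leaf module itself is not.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example `airflow`) is missing.
        return False


if __name__ == "__main__":
    # The module paths below are the ones from the question's import.
    for name in ("airflow.providers", "airflow.providers.amazon.aws.hooks.glue"):
        print(name, "->", module_available(name))
```

Running this inside the container before and after the `pip install` above should show whether the `airflow.providers` namespace became importable.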

wichon answered Oct 07 '22 16:10


The providers package is currently available only on the Airflow master branch.

If you want to check which operators are available in your version, check either the code for that specific version, for example https://github.com/apache/airflow/tree/1.10.10, or the API docs at https://airflow.apache.org/docs/1.10.10/_api/index.html
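
As an alternative to browsing the repository, the installed package can be inspected directly. The following is a rough sketch, assuming you only need the names of the modules directly under a package; `list_submodules` is a hypothetical helper built on the standard library's `pkgutil.iter_modules`, not an Airflow API.

```python
import importlib
import pkgutil


def list_submodules(package_name: str) -> list:
    """Return the sorted names of modules directly under `package_name`.

    Hypothetical helper for illustration; it imports the package itself
    but not its submodules.
    """
    package = importlib.import_module(package_name)
    # Only packages have __path__; a plain module has no submodules.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    return sorted(info.name for info in pkgutil.iter_modules(search_path))


if __name__ == "__main__":
    # Standard-library package used here so the sketch runs anywhere.
    print(list_submodules("json"))
```

On a 1.10.x install, calling `list_submodules("airflow.contrib.operators")` should list the contrib operator modules shipped with that version, since the `contrib` directory appears in the listing from the question.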

kaxil answered Oct 07 '22 18:10