
How to install packages in Airflow?

Tags:

pip

airflow

I deployed a DAG in Airflow (on GCP), but I get the error "No module named 'scipy'". How do I install packages in Airflow?

I've tried adding a separate DAG that runs the following:

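import logging
import subprocess
import sys

PACKAGES = ["scipy"]  # assumption: the original package list is not shown


def pip_install(package):
    subprocess.call([sys.executable, "-m", "pip", "install", package])


def update_packages(**kwargs):
    logging.info(list(sys.modules.keys()))
    for package in PACKAGES:
        pip_install(package)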

I've tried running pip3 install scipy in the GCP shell.

I've tried adding pip install scipy to the image builder.

None of these approaches worked.

asked Oct 29 '19 16:10 by Carlo




1 Answer

If you are using Cloud Composer on GCP, see the documentation on installing Python dependencies: https://cloud.google.com/composer/docs/how-to/using/installing-python-dependencies

Pass a requirements.txt file to the gcloud command-line tool. Format the file with each requirement specifier on a separate line.

Sample requirements.txt file:

scipy>=0.13.3
scikit-learn
nltk[machine_learning]

Pass the requirements.txt file to the gcloud command to set your installation dependencies.

gcloud composer environments update ENVIRONMENT-NAME \
    --update-pypi-packages-from-file requirements.txt \
    --location LOCATION
answered Oct 11 '22 19:10 by kaxil
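
Once the environment update has finished, one way to confirm that the packages are actually visible to the Python interpreter is a small import check. The following is only a minimal sketch, not part of the answer above: the helper name missing_modules and the list of module names are hypothetical, and it assumes you run it with the same interpreter your Airflow tasks use (for example from inside a PythonOperator callable).

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    # For a top-level name, find_spec returns None when the module is not installed.
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list. These are import names, not PyPI names:
    # scikit-learn is imported as sklearn.
    required = ["scipy", "sklearn", "nltk"]
    missing = missing_modules(required)
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check only looks modules up and does not import them, so it is cheap to run at the start of a task and log the result before the real work begins.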