Airflow: how to get pip packages installed via their docker-compose.yml?

OK, I am probably being very stupid, but how can I install additional pip packages via Airflow's docker-compose file?

I am assuming that there should be standard functionality to pick up a requirements.txt or something similar. When inspecting their repo, I do see some ENV variables like ADDITIONAL_PYTHON_DEPS that hint that this should be possible, but setting these in the docker-compose file doesn't actually install the libraries.

version: '3'
x-airflow-common:
  &airflow-common
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.1}
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
    AIRFLOW__WEBSERVER__EXPOSE_CONFIG: 'true'
    ADDITIONAL_PYTHON_DEPS: python-bitvavo-api

  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
    - ./requirements.txt:/requirements.txt

Obviously my Docker experience is very limited, but what am I missing?

asked Mar 18 '21 21:03 by Bart



2 Answers

There is a pretty detailed guide on how to achieve this in the Airflow docs. Depending on your requirements, this may be as easy as extending the original image with a FROM instruction in a new Dockerfile, or you may need to customize the image to suit your needs.

If you go with the "extending the image" approach, your new Dockerfile will look something like this:

FROM apache/airflow:2.0.1
USER root
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
         build-essential my-awesome-apt-dependency-to-add \
  && apt-get autoremove -yqq --purge \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/*
USER airflow
RUN pip install --no-cache-dir --user my-awesome-pip-dependency-to-add
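
If the extra packages are already listed in a requirements.txt, as in the question, the pip step can install from that file instead. A minimal sketch, assuming requirements.txt sits in the same directory as the Dockerfile; the /requirements.txt target path is only an illustrative choice:

```dockerfile
FROM apache/airflow:2.0.1
# Assumption: requirements.txt is in the build context, next to this Dockerfile
COPY requirements.txt /requirements.txt
# Runs as the image's default airflow user, like the pip command above
RUN pip install --no-cache-dir --user -r /requirements.txt
```

With this variant the packages are baked into the image, so the ./requirements.txt volume mount from the question should no longer be needed.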

Then you could just add something like this to the docker-compose file:

...
version: "3"
x-airflow-common: &airflow-common
  build: . # this is optional
  image: ${AIRFLOW_IMAGE_NAME:-the_name_of_your_extended_image}
  ...
...

Finally, build your image and bring everything back up using Compose. See the docs for details or a full explanation. Hope that works for you!

answered Oct 26 '22 23:10 by NicoE


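Extending the image is one way. Another way is to add the package in the docker-compose file through the _PIP_ADDITIONAL_REQUIREMENTS environment variable, which Airflow supports from version 2.1.1.

For example, say you want to pip install apache-airflow-providers-apache-hdfs. In the docker-compose file, set: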

x-airflow-common:
  &airflow-common
  image: airflow_melodie1:test
  environment:
    &airflow-common-env
    ......
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:- apache-airflow-providers-apache-hdfs other_packages}
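
Applied to the compose file from the question, this might look like the sketch below. It assumes the image tag is bumped to 2.1.1 (the question uses 2.0.1, which as far as I know predates this variable) and that python-bitvavo-api is the only extra package needed:

```yaml
x-airflow-common:
  &airflow-common
  # Assumption: an image version that supports _PIP_ADDITIONAL_REQUIREMENTS
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.1.1}
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    # Used instead of ADDITIONAL_PYTHON_DEPS, which is an argument for
    # building the image and is not read when a container starts
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-python-bitvavo-api}
```

Keep in mind that the packages are installed again every time a container starts, so this is better suited to quick local testing; for anything longer-lived, extending the image is likely the safer route.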

answered Oct 27 '22 01:10 by Yucci Mel