OK, I am probably being very stupid here, but anyway: how can I install additional pip packages via the docker-compose file of Airflow?
I am assuming there should be standard functionality to pick up a requirements.txt
or something. When inspecting their repo, I do see some ENV variables like ADDITIONAL_PYTHON_DEPS
that hint that this should be possible, but setting these in the docker-compose file doesn't actually install the libraries.
version: '3'
x-airflow-common:
  &airflow-common
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.1}
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
    AIRFLOW__WEBSERVER__EXPOSE_CONFIG: 'true'
    ADDITIONAL_PYTHON_DEPS: python-bitvavo-api
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
    - ./requirements.txt:/requirements.txt
Obviously my Docker experience is very limited, but what am I missing?
To run .yml files you have to install Docker Compose. After the installation, go to the directory containing your docker-compose.yml and execute docker-compose up to create and start the services it defines. To run Airflow in Docker, install Docker and Docker Compose, then start the containers; after that you can create your own DAGs and schedule or trigger their tasks.
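For reference, the basic flow looks roughly like this (the path is illustrative):

cd /path/to/project    # the directory containing docker-compose.yml
docker-compose up -d   # create and start the services in the background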
DO NOT expect the Docker Compose file below to be enough to run a production-ready Airflow installation. This is truly a quick-start docker-compose setup for you to get Airflow up and running locally and get your hands dirty with Airflow.
The Airflow image contains almost all the pip packages needed for operation, but we may still need to install extra packages such as clickhouse-driver, pandahouse, and apache-airflow-providers-slack. Since version 2.1.1, Airflow supports the _PIP_ADDITIONAL_REQUIREMENTS environment variable to add extra requirements when starting all containers.
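As a minimal sketch of that variable (assuming the quick-start compose layout; the package names are just examples), you would add it to the common environment block:

environment:
  # installed by the entrypoint at container startup (Airflow >= 2.1.1)
  _PIP_ADDITIONAL_REQUIREMENTS: clickhouse-driver pandahouse apache-airflow-providers-slack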
Got the answer at the Airflow GitHub discussions. The only way right now to install extra Python packages is to build your own image. I will try to explain this solution in more detail.

Step 1. Put the Dockerfile, docker-compose.yaml, and requirements.txt files in the project directory.

Step 2. Paste the code below into the Dockerfile:

FROM apache/airflow:2.1.0
COPY requirements.txt .
RUN pip install -r requirements.txt

Step 3. Paste into docker-compose.yaml the code you can find in the official documentation.
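Assuming the stock compose file from the docs (which selects the image via AIRFLOW_IMAGE_NAME), one way to wire in your custom image is to build and tag it yourself; the tag below is illustrative:

docker build -t my-airflow:2.1.0 .
AIRFLOW_IMAGE_NAME=my-airflow:2.1.0 docker-compose up -d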
There is a pretty detailed guide on how to achieve what you are looking for in the Airflow docs here. Depending on your requirements, this may be as easy as extending the original image with a FROM directive in a new Dockerfile, or you may need to fully customize the image to suit your needs.
If you go with the extending-the-image approach, your new Dockerfile will be something like this:
FROM apache/airflow:2.0.1
USER root
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
         build-essential my-awesome-apt-dependency-to-add \
  && apt-get autoremove -yqq --purge \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/*
USER airflow
RUN pip install --no-cache-dir --user my-awesome-pip-dependency-to-add
Then you could just add something like this to the docker-compose file:
...
version: "3"
x-airflow-common: &airflow-common
  build: .  # this is optional
  image: ${AIRFLOW_IMAGE_NAME:-the_name_of_your_extended_image}
...
Finally, build your image and turn everything back on using compose. See the docs for the details and a full explanation. Hope that works for you!
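That last step is roughly the following, assuming the build: directive above is enabled:

docker-compose build
docker-compose up -d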
Extending the image is one way. Another way is adding the package in the docker-compose file.
For example, if you want to pip install apache-airflow-providers-apache-hdfs, you go to the docker-compose file:
x-airflow-common:
  &airflow-common
  image: airflow_melodie1:test
  environment:
    &airflow-common-env
    ......
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-apache-airflow-providers-apache-hdfs other_packages}
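Because this uses Compose variable substitution, you can also supply the value from an .env file next to docker-compose.yaml instead of hard-coding a default (package list illustrative):

_PIP_ADDITIONAL_REQUIREMENTS=apache-airflow-providers-apache-hdfs other_packages

Note that these packages are reinstalled every time the containers start, so this option is better suited to quick local iteration than to production images.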