I want to deploy Airflow on Kubernetes where pods have access to the same DAGs, in a Shared Persistent Volume.
According to the documentation (https://github.com/helm/charts/tree/master/stable/airflow#using-one-volume-for-both-logs-and-dags), it seems I have to set and pass these values to Helm: `extraVolume`, `extraVolumeMount`, `persistence.enabled`, `logsPersistence.enabled`, `dags.path`, `logs.path`.
Any custom values I pass when installing the official Helm chart result in errors similar to:

```
Error: YAML parse error on airflow/templates/deployments-web.yaml: error converting YAML to JSON: yaml: line 69: could not find expected ':'
```
For example, both of the following commands fail:

```shell
microk8s.helm install --namespace "airflow" --name "airflow" stable/airflow \
--set airflow.extraVolumes=/home/*user*/github/airflowDAGs \
--set airflow.extraVolumeMounts=/home/*user*/github/airflowDAGs \
--set dags.path=/home/*user*/github/airflowDAGs/dags \
--set logs.path=/home/*user*/github/airflowDAGs/logs \
--set persistence.enabled=false \
--set logsPersistence.enabled=false
```

and

```shell
microk8s.helm install --namespace "airflow" --name "airflow" stable/airflow --values=values_pv.yaml
```

with `values_pv.yaml`: https://pastebin.com/PryCgKnC. Change `/home/*user*/github/airflowDAGs` to a path on your machine to replicate the error. The relevant section of `values.yaml`:

```yaml
## Configure DAGs deployment and update
dags:
  ##
  ## mount path for persistent volume.
  ## Note that this location is referred to in airflow.cfg, so if you change it, you must update airflow.cfg accordingly.
  path: /home/*user*/github/airflowDAGs/dags
```
How do I configure `airflow.cfg` in a Kubernetes deployment? In a non-containerized deployment of Airflow, this file can be found in `~/airflow/airflow.cfg`.
The error refers to line 69 of the template: https://github.com/helm/charts/blob/master/stable/airflow/templates/deployments-web.yaml#L69, which contains `git`. Are the `.yaml` files wrongly configured, so that the chart falsely tries to use `git pull`, and fails because no git path is specified?
`microk8s.kubectl version`: v1.15.4
`microk8s.helm version`: v2.14.3

How do I correctly pass the right values to the Airflow Helm chart to deploy Airflow on Kubernetes, with pods having access to the same DAGs and logs on a shared persistent volume?
Not sure if you have this solved yet, but if you haven't I think there is a pretty simple way close to what you are doing.
All of the Deployments, Services, and Pods need the persistent volume information: where it lives locally and where it should go within each kube kind. It looks like the values.yaml for the chart provides a way to do this. I'll only show this with dags below, but I think it should be roughly the same process for logs as well.

So the basic steps are: 1) tell kube where the 'volume' (directory) lives on your computer, 2) tell kube where to put that in your containers, and 3) tell airflow where to look for the dags. You can copy the values.yaml file from the helm repo and alter it with the following.
First, you need to create a volume containing the items in your local directory (this is the `extraVolumes` below). Then, that volume needs to be mounted; luckily, putting it here will template it into all the kube files. So basically, `extraVolumes` creates the volume, and `extraVolumeMounts` mounts the volume, both in the `airflow` section:
```yaml
airflow:
  extraVolumeMounts: # this will get the volume and mount it to that path in the container
    - name: dags
      mountPath: /usr/local/airflow/dags # location in the container it will put the directory mentioned below
  extraVolumes: # this will create the volume from the directory
    - name: dags
      hostPath:
        path: "path/to/local/directory" # For you this is something like /home/*user*/github/airflowDAGs/dags
```
Next, tell airflow where to look for the dags by overriding `airflow.cfg` through the `config` map:

```yaml
airflow:
  config:
    AIRFLOW__CORE__DAGS_FOLDER: "/usr/local/airflow/dags" # this needs to match the mountPath in the extraVolumeMounts section
```
Then install the chart with your amended `values.yaml` file:

```shell
helm install --namespace "airflow" --name "airflow" -f local/path/to/values.yaml stable/airflow
```
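Before installing, it can help to double-check that the pieces above agree with each other: every entry in `extraVolumeMounts` must reference a volume defined in `extraVolumes` (matching `name`), and `AIRFLOW__CORE__DAGS_FOLDER` must equal one of the mount paths. A minimal sanity-check sketch (this is a hypothetical helper, not part of the chart; the values are written as a plain dict so it runs without a YAML parser, and the `/home/user/...` path is illustrative):

```python
# Parsed form of the values.yaml fragments above, as a plain dict.
values = {
    "airflow": {
        "extraVolumes": [
            {"name": "dags", "hostPath": {"path": "/home/user/github/airflowDAGs/dags"}}
        ],
        "extraVolumeMounts": [
            {"name": "dags", "mountPath": "/usr/local/airflow/dags"}
        ],
        "config": {
            "AIRFLOW__CORE__DAGS_FOLDER": "/usr/local/airflow/dags"
        },
    }
}

airflow = values["airflow"]
volume_names = {v["name"] for v in airflow["extraVolumes"]}

# Every mount must reference a volume that is actually defined.
for mount in airflow["extraVolumeMounts"]:
    assert mount["name"] in volume_names, f"no volume named {mount['name']}"

# The folder airflow scans must be one of the mounted container paths.
dags_folder = airflow["config"]["AIRFLOW__CORE__DAGS_FOLDER"]
mount_paths = {m["mountPath"] for m in airflow["extraVolumeMounts"]}
assert dags_folder in mount_paths, "DAGS_FOLDER does not match any mountPath"

print("values look consistent")
```

If either assertion fires, the chart will template a deployment that mounts nothing (or mounts it where airflow isn't looking), which is the usual cause of an empty DAG list.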
In the end, this should allow airflow to see your local directory in the dags folder. If you add a new file, it should show up in the container, though it may take a minute to show up in the UI (I don't think the dagbag process runs constantly). Anyway, hope this helps!
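On that UI delay: the scheduler rescans the DAGs directory on an interval controlled by Airflow's `scheduler.dag_dir_list_interval` setting (300 seconds by default), so new files can take a few minutes to appear. If that's too slow, you could lower it through the same `config` passthrough shown above (a sketch, assuming you keep the chart's env-var style config):

```yaml
airflow:
  config:
    AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: "60" # rescan the dags folder every 60s instead of the default 300
```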