 

Helm stable/airflow - Custom values for Airflow deployment with Shared Persistent Volume using Helm chart failing

Objective

I want to deploy Airflow on Kubernetes where pods have access to the same DAGs, in a Shared Persistent Volume. According to the documentation (https://github.com/helm/charts/tree/master/stable/airflow#using-one-volume-for-both-logs-and-dags), it seems I have to set and pass these values to Helm: extraVolume, extraVolumeMount, persistence.enabled, logsPersistence.enabled, dags.path, logs.path.
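For reference, a minimal sketch of how those keys might be arranged in a values.yaml (the layout is assumed from the chart's README and should be checked against your chart version; volume names and paths are illustrative):

```yaml
# Sketch only -- key names taken from the chart README, paths are placeholders.
airflow:
  extraVolumes:          # define the shared volume
    - name: airflow-storage
      hostPath:
        path: /path/to/shared
  extraVolumeMounts:     # mount it into the containers
    - name: airflow-storage
      mountPath: /opt/airflow/shared
dags:
  path: /opt/airflow/shared/dags   # must match the mountPath above
logs:
  path: /opt/airflow/shared/logs
persistence:
  enabled: false         # disable the chart-managed DAGs PVC
logsPersistence:
  enabled: false         # disable the chart-managed logs PVC
```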

Problem

Any custom values I pass when installing the official Helm chart result in errors similar to:

Error: YAML parse error on airflow/templates/deployments-web.yaml: error converting YAML to JSON: yaml: line 69: could not find expected ':'
  • Works fine: microk8s.helm install --namespace "airflow" --name "airflow" stable/airflow
  • Not working:
microk8s.helm install --namespace "airflow" --name "airflow" stable/airflow \
--set airflow.extraVolumes=/home/*user*/github/airflowDAGs \
--set airflow.extraVolumeMounts=/home/*user*/github/airflowDAGs \
--set dags.path=/home/*user*/github/airflowDAGs/dags \
--set logs.path=/home/*user*/github/airflowDAGs/logs \
--set persistence.enabled=false \
--set logsPersistence.enabled=false
  • Also not working: microk8s.helm install --namespace "airflow" --name "airflow" stable/airflow --values=values_pv.yaml, with values_pv.yaml: https://pastebin.com/PryCgKnC
    • Edit: Please change /home/*user*/github/airflowDAGs to a path on your machine to replicate the error.
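A likely source of the parse error: airflow.extraVolumes and airflow.extraVolumeMounts are lists of volume definitions in the chart's values, so passing a bare path string with --set produces values the templates cannot render. A hedged sketch of the structured form the chart appears to expect (volume name and paths are illustrative):

```yaml
airflow:
  extraVolumes:
    - name: dags              # illustrative name
      hostPath:
        path: /path/to/dags   # replace with your local directory
  extraVolumeMounts:
    - name: dags
      mountPath: /usr/local/airflow/dags
```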

Concerns

  1. Maybe it is going wrong because of these lines in the default values.yaml:
## Configure DAGs deployment and update
dags:
  ##
  ## mount path for persistent volume.
  ## Note that this location is referred to in airflow.cfg, so if you change it, you must update airflow.cfg accordingly.
  path: /home/*user*/github/airflowDAGs/dags

How do I configure airflow.cfg in a Kubernetes deployment? In a non-containerized deployment of Airflow, this file can be found in ~/airflow/airflow.cfg.

  2. Line 69 in the error message refers to this line in deployments-web.yaml: https://github.com/helm/charts/blob/master/stable/airflow/templates/deployments-web.yaml#L69

That line involves git. Are the .yaml templates misconfigured, so that the chart wrongly tries to use git pull, which then fails because no git repository is specified?

System

  • OS: Ubuntu 18.04 (single machine)
  • MicroK8s: v1.15.4 Rev:876
  • microk8s.kubectl version: v1.15.4
  • microk8s.helm version: v2.14.3

Question

How do I correctly pass the right values to the Airflow Helm chart to be able to deploy Airflow on Kubernetes with Pods having access to the same DAGs and logs on a Shared Persistent Volume?

NumesSanguis asked Oct 17 '25 11:10

1 Answer

Not sure if you have this solved yet, but if you haven't, I think there is a pretty simple way close to what you are doing.

All of the Deployments, Services, and Pods need the persistent volume information: where it lives locally and where it should go within each kube kind. It looks like the values.yaml for the chart provides a way to do this. I'll only show this with dags below, but I think it should be roughly the same process for logs as well.

So the basic steps are: 1) tell kube where the 'volume' (directory) lives on your computer, 2) tell kube where to mount it in your containers, and 3) tell airflow where to look for the dags. You can copy the values.yaml file from the helm repo and alter it as follows.
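One way to get a copy of the default values to edit (this assumes Helm 2's `inspect values` subcommand, which matches the helm v2.14.3 noted in the question):

```shell
# Dump the chart's default values into a local file you can edit
microk8s.helm inspect values stable/airflow > values.yaml
```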

  1. The airflow section

First, you need to create a volume containing the items in your local directory (this is the extraVolumes below). Then, that volume needs to be mounted - luckily, putting it here will template it into all the kube files. So basically, extraVolumes creates the volume, and extraVolumeMounts mounts the volume.

airflow:
  extraVolumeMounts: # this will get the volume and mount it to that path in the container
  - name: dags
    mountPath: /usr/local/airflow/dags  # location in the container it will put the directory mentioned below.

  extraVolumes: # this will create the volume from the directory
  - name: dags
    hostPath:
      path: "path/to/local/directory"  # For you this is something like /home/*user*/github/airflowDAGs/dags

  2. Tell the airflow config where the dags live in the container (same yaml section as above).
airflow:
  config:
    AIRFLOW__CORE__DAGS_FOLDER: "/usr/local/airflow/dags"  # this needs to match the mountPath in the extraVolumeMounts section
  3. Install with helm and your new values.yaml file.
helm install --namespace "airflow" --name "airflow" -f local/path/to/values.yaml stable/airflow

In the end, this should allow airflow to see your local directory in the dags folder. If you add a new file, it should show up in the container - though it may take a minute to appear in the UI, since I don't think the dagbag process is constantly running. Anyway, hope this helps!
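To check the mount actually worked, something like the following should list your local DAG files from inside the webserver pod (the namespace, label selector, and mount path here are illustrative; adjust them to your release):

```shell
# Find the webserver pod and list the mounted dags directory
POD=$(microk8s.kubectl get pods -n airflow -l app=airflow-web \
  -o jsonpath='{.items[0].metadata.name}')
microk8s.kubectl exec -n airflow "$POD" -- ls /usr/local/airflow/dags
```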

particularB answered Oct 19 '25 08:10