I am currently trying to set up Airflow to work in a Kubernetes-like environment. For Airflow to be useful, I need to be able to use the Git-Sync feature so that the DAGs can be stored separately from the Pod and are not reset when the Pod downscales or restarts. I am trying to set it up with SSH.
I have been searching for good documentation on the Airflow config or tutorials on how to set this up properly, but this has been to no avail. I would very much appreciate some help here, as I have been struggling with this for a while.
Here is how I set the relevant config. Please note that I have used placeholders for links and some other information for security reasons:
git_repo = https://<git-host>/scm/<project-name>/airflow
git_branch = develop
git_subpath = dags
git_sync_root = /usr/local/airflow
git_sync_dest = dags
git_sync_depth = 1
git_sync_ssh = true
git_dags_folder_mount_point = /usr/local/airflow/dags
git_ssh_key_secret_name = airflow-secrets
git_ssh_known_hosts_configmap_name = airflow-configmap
dags_folder = /usr/local/airflow/
executor = KubernetesExecutor
dags_in_image = False
Here is how I have set up my origin/config repo:
-root
  |-configmaps/airflow
    |-airflow.cfg
    |-airflow-configmap.yaml
  |-environment
    |-<environment specific stuff>
  |-secrets
    |-airflow-secrets.yaml
  |-ssh
    |-id_rsa
    |-id_rsa.pub
  |-README.md
The airflow-configmap and secrets look like this:
apiVersion: v1
kind: Secret
metadata:
  name: airflow-secrets
data:
  # key needs to be gitSshKey
  gitSshKey: <base64 encoded private sshKey>
and
apiVersion: v1
kind: ConfigMap
metadata:
  name: airflow-configmap
data:
  known_hosts: |
    https://<git-host>/ ssh-rsa <base64 encoded public sshKey>
The repo that I am trying to sync to has the Public key set as an access key and is just a folder named dags with 1 dag inside.
My issue is that I do not know what my issue is at this point. I have no way of knowing which parts of my config are set correctly and which are not, and the documentation on the subject is very lackluster.
If there is more information that is required I will be happy to provide it.
Thank you for your time
What's the error you're seeing when doing this?
A couple of things you need to consider:
Create an SSH key locally using this link and add the public key to the repository:
Repository Name > Settings > Deploy Keys > Value of ssh_key.pub
Ensure "write access" is checked
The Dockerfile I'm using looks like:
FROM apache/airflow:2.1.2
COPY requirements.txt .
RUN python -m pip install --upgrade pip
RUN pip install -r requirements.txt
The values.yaml from the official Airflow Helm repository (helm repo add apache-airflow https://airflow.apache.org) needs the following values updated under gitSync:
enabled: true
repo: ssh://git@<git-host>/username/repository-name.git
branch: master
subPath: ""  # if DAGs are in repository root
sshKeySecret: airflow-ssh-git-secret
credentialsSecret: git-credentials
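Since sshKeySecret only makes sense with an SSH remote, it may be worth checking that repo is not an https:// URL (the config in the question combines git_sync_ssh = true with an https:// git_repo, which may be part of the problem). Here is a minimal Python sketch of such a check. The helper name is_ssh_repo_url and the example.com hosts are made up for illustration and are not part of Airflow or the Helm chart.

```python
def is_ssh_repo_url(repo: str) -> bool:
    """Return True if the string looks like an SSH-style Git remote."""
    # Explicit scheme form: ssh://user@host/path.git
    if repo.startswith("ssh://"):
        return True
    # scp-like form without a scheme: user@host:path.git
    if "://" not in repo and "@" in repo:
        after_user = repo.split("@", 1)[1]
        return ":" in after_user
    return False


if __name__ == "__main__":
    # Hypothetical remotes, for illustration only.
    for remote in (
        "ssh://git@example.com/username/repository-name.git",
        "git@example.com:username/repository-name.git",
        "https://example.com/scm/project/airflow",
    ):
        print(remote, "->", "ssh" if is_ssh_repo_url(remote) else "not ssh")
```

If the check reports "not ssh" for the value you put under repo, the SSH key secret is unlikely to be used for that remote.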
Export the SSH key and known_hosts to a Kubernetes secret for accessing the private repository:
kubectl create secret generic airflow-ssh-git-secret \
--from-file=gitSshKey=/path/to/.ssh/id_ed25519 \
--from-file=known_hosts=/path/to/.ssh/known_hosts \
--from-file=id_ed25519.pub=/path/to/.ssh/id_ed25519.pub \
-n airflow
Create and apply manifests:
apiVersion: v1
kind: Secret
metadata:
  namespace: airflow
  name: airflow-ssh-git-secret
data:
  gitSshKey: <base64_encoded_private_key_id_ed25519_in_one_line>
---
apiVersion: v1
kind: Secret
metadata:
  name: git-credentials
data:
  GIT_SYNC_USERNAME: base64_encoded_git_username
  GIT_SYNC_PASSWORD: base64_encoded_git_password
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: airflow
  name: known-hosts
data:
  known_hosts: |
    line 1 of known_host file
    line 2 of known_host file
    line 3 of known_host file
    ...
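Values under data in a Secret have to be base64 encoded, and the private key has to end up on a single line. If you write the manifest by hand rather than using kubectl create secret, something like the following Python sketch can produce that value. The function name encode_secret_value and the dummy key bytes are assumptions for illustration; in practice you would read the bytes from your own key file.

```python
import base64


def encode_secret_value(raw: bytes) -> str:
    """Base64-encode raw bytes as one line, suitable for a Secret's data field."""
    # b64encode (unlike encodebytes) does not insert line breaks.
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Dummy content standing in for a real private key file.
    dummy_key = (
        b"-----BEGIN OPENSSH PRIVATE KEY-----\n"
        b"not-a-real-key\n"
        b"-----END OPENSSH PRIVATE KEY-----\n"
    )
    encoded = encode_secret_value(dummy_key)
    print(encoded)
    # Decoding gives back the original bytes, trailing newline included.
    assert base64.b64decode(encoded) == dummy_key
```

Keeping the trailing newline of the key file intact before encoding is usually a good idea, since some SSH clients reject a private key without it.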
Update Airflow release
helm upgrade --install airflow apache-airflow/airflow -n airflow -f values.yaml --debug
Get pods in the airflow namespace
kubectl get pods -n airflow
The airflow-scheduler-SOME-STRING pod is going to have 3 containers running. View the logs of the git-sync-init container if you don't see the pods in Running state.
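One thing worth ruling out when the init container fails is a malformed known_hosts entry. Each line is expected to look like "<host> <key-type> <base64-host-key>", where the host is a plain hostname (not an https:// URL) and the key is the Git server's host key (for example as reported by ssh-keyscan), not your own deploy key. Below is a rough Python sketch of a format check. The function name known_hosts_problems, the list of key types, and the example.com entries are assumptions for illustration, and the check only looks at the format, not at whether the key actually belongs to the host.

```python
import base64
from typing import List

# Common key type names; not an exhaustive list.
KEY_TYPES = {
    "ssh-rsa",
    "ssh-ed25519",
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
}


def known_hosts_problems(line: str) -> List[str]:
    """Return a list of format problems found in one known_hosts line."""
    fields = line.split()
    # Skip an optional marker such as @cert-authority or @revoked.
    if fields and fields[0].startswith("@"):
        fields = fields[1:]
    if len(fields) < 3:
        return ["expected '<host> <key-type> <base64-key>'"]

    host, key_type, key = fields[0], fields[1], fields[2]
    problems = []
    if "://" in host:
        problems.append("host field looks like a URL, expected a hostname")
    if key_type not in KEY_TYPES:
        problems.append("unrecognised key type: " + key_type)
    try:
        base64.b64decode(key, validate=True)
    except ValueError:
        problems.append("key field is not valid base64")
    return problems


if __name__ == "__main__":
    # Dummy key material, for illustration only.
    dummy_key = base64.b64encode(b"dummy-host-key").decode("ascii")
    print(known_hosts_problems("example.com ssh-ed25519 " + dummy_key))
    print(known_hosts_problems("https://example.com/ ssh-rsa " + dummy_key))
```

An empty list means the line at least has the expected shape; the entry from the question, which starts with https://, would be flagged.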