
Airflow DAG Versioning

Tags:

airflow

Is DAG versioning a thing? I can't find much on the subject with a few Google searches. I would like to look at the DAGs screen in Airflow and be sure of what DAG code is in the wild.

The simplest solution would be to include a version number as part of the dag_id, but I would appreciate knowing if anyone has a better, alternative solution. Tags would work too and might look good in the UI - they are designed for filtering though, so I'm not sure whether there would be undesirable side-effects.
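
To make the two options concrete, here is a minimal sketch of how a version could be attached through either the dag_id or a tag. The helper name `versioned_dag_kwargs` and the `DAG_VERSION` constant are hypothetical, made up for illustration; the only Airflow assumption is that `DAG` accepts `dag_id` and `tags` keyword arguments.

```python
import re

# Hypothetical version string; in practice it might be bumped by hand or by CI.
DAG_VERSION = "1.2.0"


def versioned_dag_kwargs(base_id, version, in_dag_id=True):
    """Build `dag_id` and `tags` keyword arguments that carry a version.

    This is an illustrative helper, not part of Airflow.
    """
    # Airflow only allows alphanumerics, dashes, dots and underscores in a
    # dag_id, so any other character in the version is replaced.
    safe_version = re.sub(r"[^A-Za-z0-9_.-]", "_", version)
    dag_id = f"{base_id}_v{safe_version}" if in_dag_id else base_id
    # The tag carries the version either way, so it shows up in the DAGs screen.
    return {"dag_id": dag_id, "tags": [f"v{safe_version}"]}


# Option 1: version in the dag_id (and in a tag).
in_id = versioned_dag_kwargs("etl", DAG_VERSION)

# Option 2: stable dag_id, version only in a tag.
tag_only = versioned_dag_kwargs("etl", DAG_VERSION, in_dag_id=False)
```

Either dictionary could then be unpacked into the DAG constructor, e.g. `DAG(**tag_only, ...)`. One trade-off to keep in mind: a changed dag_id is treated by Airflow as a different DAG, so run history would not carry over between versions, whereas the tag-only variant keeps the history under one dag_id.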

Phil asked May 14 '20 11:05



2 Answers

As the author of the DAG Versioning AIP, I can say that this work has been deferred until after 2.0, mainly so that we can support end-to-end DAG Versioning.

Originally, we (Airflow Core Committers) were planning to have Webserver-only DAG Versioning, i.e. to improve the visibility behaviour but not the execution behaviour:

"The scope of this AIP is to make sure that the visibility behavior of Airflow is correct, without changing the execution behaviour which will continue to be based on the most recent version of the DAG."

This means it overcomes the "always-latest" issue: you can go back to an old version of the DAG, for example to view the shape of the DAG a few months back, and see the correct representation.

Currently, Airflow suffers from the issue where if you add/remove a task, it gets added/removed in all the previous DagRuns in the Webserver.

However, we have since decided to accomplish Remote DAG Fetcher + DAG Versioning together and enable versioning of DAGs on the worker side, so a user will be able to run a DAG with a previous version too.

We don't have a date yet, but we are mostly planning to do it around the end of 2021.

kaxil answered Sep 17 '22 15:09



The Airflow project has an open draft feature to support DAG versions. For now, the answer is that Airflow does not support versions.

The first use case in the link describes a key limitation: log files from previous runs can only surface nodes from the current DAG.

Merlin answered Sep 17 '22 15:09
