 

Re-run part of an Airflow Subdag

I have a daily DAG that contains a subdag. The subdag has five tasks, T1 through T5, that must run in order (e.g. T1 >> T2 >> T3 >> T4 >> T5).

The DAG ran successfully for a few days, but then I discovered a bug in T4. I fixed the bug and want to re-run just T4 and T5 for all previous days. It's important NOT to re-run T1-T3 because these steps take much longer than T4-T5.

What I've tried that has failed:

  1. Select T4, clear downstream+recursive - Nothing happens. The DAG tree view shows the subdag as "success" even though T4 and T5 within it are cleared.
  2. Select T4, clear downstream+recursive, select subdag, clear just that task - This re-runs the entire subdag (T1-T5) even though T1-T3 were marked as success.
  3. Select T4, clear downstream+recursive, select subdag, click run - Same as #2. Re-runs the entire subdag.
  4. Select T4, clear downstream+recursive, manually set the subdag to the "running" state - Nothing happens. The tree view shows the subdag in the "running" state but no tasks actually run.

This only seems to be a problem when trying to re-run part of a subdag. In a regular DAG, selecting a task in the middle and choosing clear downstream+recursive normally re-runs the DAG from that point.

Any suggestions would be appreciated.

asked Jan 18 '18 22:01 by stipe108



1 Answer

You can restart the failed tasks inside a subDAG. Here is how:

  • Zoom into the subDAG, clear the status of failed tasks.
  • Go back to main DAG, select the subDAG.
  • Uncheck Recursive and/or Downstream.
  • Clear status of subDAG.
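
For many past days, clicking through the UI gets tedious, so here is a rough command-line sketch of the same steps. It assumes the Airflow 1.x `airflow clear` command (Airflow 2 renamed it to `airflow tasks clear`) and the usual `parent_dag_id.subdag_task_id` naming of subdags. The DAG and task names are hypothetical placeholders, and I have not verified that the CLI behaves exactly like the UI steps above, so treat it as a starting point. The script only builds and prints the commands; it does not run them.

```python
import shlex
from typing import List


def build_clear_commands(parent_dag: str, subdag_task: str, first_task: str,
                         start_date: str, end_date: str) -> List[List[str]]:
    """Build two `airflow clear` commands (Airflow 1.x flags assumed).

    1. Clear `first_task` and everything downstream of it inside the subdag.
    2. Clear only the subdag task in the parent DAG, without recursing
       into the subdag's own tasks.
    """
    # Assumption: the subdag's dag_id follows the "<parent>.<task_id>" convention.
    subdag_id = f"{parent_dag}.{subdag_task}"
    date_args = ["-s", start_date, "-e", end_date]

    # -t: task regex, -d: include downstream tasks, -c: skip the confirmation prompt
    clear_inner = ["airflow", "clear", "-t", f"^{first_task}$", "-d", "-c",
                   *date_args, subdag_id]
    # -x: exclude subdags, so the already successful inner tasks stay untouched
    clear_outer = ["airflow", "clear", "-t", f"^{subdag_task}$", "-x", "-c",
                   *date_args, parent_dag]
    return [clear_inner, clear_outer]


if __name__ == "__main__":
    # Hypothetical names and dates; replace with your own.
    for cmd in build_clear_commands("my_daily_dag", "my_subdag", "T4",
                                    "2018-01-01", "2018-01-18"):
        print(shlex.join(cmd))
```

The first command corresponds to clearing T4 (and T5 via downstream) inside the subDAG; the second corresponds to clearing the subDAG task itself with Recursive and Downstream unchecked.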

answered Oct 09 '22 10:10 by Toan Tran