
airflow create subdag with a different schedule_interval than parent dag

I've been trying to set up a parent DAG that has two subdags, each of which runs at a slightly different time because of the availability of its data source. However, the subdags seem to be kicked off immediately with the parent DAG, disregarding their own schedule_intervals. Does anyone know if this is the default behavior for Airflow? Is there a way around it without turning them into standalone DAGs or using sensors?

user3582076 asked Jan 29 '23 08:01


1 Answer

The subdag is going to obey the parent dag schedule (since it's the parent that triggers the subdag) and won't run on its own schedule unless it's configured to do so as a standalone dag.

What you probably want is some other type of dependency mechanism. I'm guessing your scenario is something like this:

  1. You have DagA and DagB, each of which runs at a different time of day
  2. DagB depends on DagA (or some DagC depends on DagA and DagB)
  3. You created a DagX that has DagA and DagB as subdags to control the dependencies

I'm not sure why you wouldn't want DagA and DagB to be standalone DAGs, but if you really want to preserve your structure, you can set the parent DAG's schedule to the greatest common divisor of DagA's and DagB's schedules and add conditional flows that skip each one when it isn't due.
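To make the greatest-common-divisor idea concrete, here is a minimal plain-Python sketch. It assumes each child runs once a day at a fixed time; the start times, the `CHILD_START_MINUTES` mapping, and the helper names are all hypothetical. It only shows the arithmetic and the "is it due?" check, not a full Airflow DAG.

```python
from datetime import datetime
from functools import reduce
from math import gcd

MINUTES_PER_DAY = 24 * 60

# Hypothetical daily start times, in minutes after midnight.
CHILD_START_MINUTES = {
    "dag_a": 8 * 60,       # 08:00
    "dag_b": 8 * 60 + 30,  # 08:30
}


def parent_interval_minutes(start_minutes):
    """Largest interval that divides the day evenly and hits every start time."""
    return reduce(gcd, start_minutes, MINUTES_PER_DAY)


def is_due(child, run_time):
    """True when this parent run coincides with the child's own start time."""
    minute_of_day = run_time.hour * 60 + run_time.minute
    return minute_of_day == CHILD_START_MINUTES[child]


# With the times above, the parent would need to run every 30 minutes.
interval = parent_interval_minutes(CHILD_START_MINUTES.values())
```

In Airflow, a predicate like `is_due` could back a conditional task such as a `ShortCircuitOperator` placed ahead of each subdag, so the subdag is skipped on parent runs where it isn't due. Which timestamp to pass in depends on how your Airflow version defines the run's date, so treat that part as something to verify.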

On the other hand, I would suggest mapping the dependencies directly in code instead of leaving them implicit in the scheduling. If DagA depends on something external, be it a data source or another DAG, you can use a Sensor.
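Conceptually, a sensor is a task that keeps checking a condition and only lets downstream work proceed once the condition holds. The sketch below illustrates that polling idea in plain Python. It is not Airflow's API; `wait_for`, `poke`, `poke_interval`, and `timeout` are illustrative names chosen to echo the sensor vocabulary.

```python
import time
from typing import Callable


def wait_for(poke: Callable[[], bool],
             poke_interval: float = 1.0,
             timeout: float = 10.0) -> bool:
    """Call poke() repeatedly until it returns True or the timeout elapses.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while True:
        if poke():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poke_interval)
```

In Airflow itself the same idea is expressed by a sensor operator, typically a subclass of `BaseSensorOperator` whose `poke` method returns True once the data source or upstream DAG is ready.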

villasv answered Feb 16 '23 16:02