I want to set my DAG to run one day at a time. How can I achieve this?
I tried depends_on_past=True, but it only ensures that each task instance waits for the same task in the previous run. What I want is that, if I'm backfilling from day X, all tasks of day X are run before the DAG run for day X+1 can start, and so on.
To run a DAG at a fixed interval rather than at a specific time, you can pass a timedelta object as the schedule interval. For example, schedule_interval=timedelta(minutes=30) will run the DAG every thirty minutes, and schedule_interval=timedelta(days=1) will run the DAG every day.
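As a rough illustration of what those two timedelta intervals mean, the sketch below lists the interval boundaries they produce from a common start date. It uses only the standard library; run_times is a hypothetical helper written for this example, not part of Airflow, and it does not reproduce how the Airflow scheduler actually decides when to trigger a run.

```python
from datetime import datetime, timedelta


def run_times(start, interval, count):
    """Return the first `count` interval boundaries, beginning at `start`.

    Hypothetical helper for illustration only; not an Airflow API.
    """
    return [start + i * interval for i in range(count)]


start = datetime(2024, 1, 1)  # arbitrary example start date

# Boundaries for timedelta(minutes=30): one every thirty minutes.
half_hourly = run_times(start, timedelta(minutes=30), 3)

# Boundaries for timedelta(days=1): one every day.
daily = run_times(start, timedelta(days=1), 3)

for label, times in (("every 30 minutes", half_hourly), ("every day", daily)):
    print(label, [t.isoformat() for t in times])
```

The same timedelta objects are what you would pass as schedule_interval in the DAG definition.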
Because Airflow has its own scheduler and adopts the schedule interval syntax from cron, the smallest date and time interval in the Airflow scheduler is one minute. Inside the scheduler, the only thing that runs continuously is the scheduler itself.
You can run a task independently by passing the -i, -I, or -A flag to the run command. However, the design of Airflow does not permit running a specific task together with all of its dependencies.
You can use max_active_runs to control the number of active DAG runs. Limiting it to one should satisfy your use case.
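from datetime import timedelta

import airflow

# tmpl_search_path and args are assumed to be defined earlier in the DAG file.
dag = airflow.DAG(
    'customer_staging',
    schedule_interval="@daily",
    dagrun_timeout=timedelta(minutes=60),
    template_searchpath=tmpl_search_path,
    default_args=args,
    # Allow only one active DAG run at a time, so runs execute one day at a time.
    max_active_runs=1)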