How can I make sure my Airflow DAG runs one day at a time?

I want to set my DAG to run one day at a time. How can I achieve this?

I tried depends_on_past=True, but that only makes each task wait for its own previous run. What I want is that, if I'm backfilling from day X, all tasks for day X finish before the DAG run for day X+1 can start, and so on.

William asked Dec 26 '17 13:12


1 Answer

You can use max_active_runs to control the number of active DAG runs. Limiting it to one should satisfy your use case, because the next day's run cannot start until the current one has finished.

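from datetime import timedelta

import airflow

# tmpl_search_path and args are assumed to be defined elsewhere in the DAG file.
dag = airflow.DAG(
    'customer_staging',
    schedule_interval="@daily",
    dagrun_timeout=timedelta(minutes=60),
    template_searchpath=tmpl_search_path,
    default_args=args,
    # Allow only one active DAG run at a time.
    max_active_runs=1)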
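
For a fuller picture, here is a minimal sketch of a complete DAG file that combines max_active_runs=1 with the depends_on_past=True setting from the question. It assumes an Airflow 2.x install where schedule_interval is still accepted. The DAG id, task names, start date and default_args values are hypothetical and only there for illustration. It covers runs created by the scheduler's catch-up; how a CLI-triggered backfill treats max_active_runs may differ between versions, so check that against your own install.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical defaults for illustration; adjust to your own setup.
default_args = {
    "owner": "airflow",
    "depends_on_past": True,  # each task waits for its own previous run
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="one_day_at_a_time_example",  # hypothetical name
    start_date=datetime(2017, 12, 1),    # "day X" to catch up from
    schedule_interval="@daily",
    catchup=True,        # create a run for every missed day
    max_active_runs=1,   # but keep only one DAG run active at a time
    default_args=default_args,
) as dag:
    # Two placeholder tasks; {{ ds }} renders the run's logical date.
    extract = BashOperator(task_id="extract", bash_command="echo extract {{ ds }}")
    load = BashOperator(task_id="load", bash_command="echo load {{ ds }}")

    extract >> load
```

With this setup the run for day X+1 should not start until the run for day X has left the running state, and depends_on_past additionally holds a task back if its previous run did not succeed.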
x97Core answered Nov 18 '22 10:11