How can I configure Airflow (MWAA) so that a DAG fires at the same time (6am PST) every day, regardless of when the DAG is deployed?
I have tried what makes sense to me:
0 6 * * *
now = datetime.utcnow()
now = now.replace(tzinfo=pendulum.timezone('America/Los_Angeles'))
previous_five_am = now.replace(hour = 5, minute = 0, second = 0, microsecond = 0)
start_date = previous_five_am
It seems that whenever I deploy with the start_date set to 5am the previous day, it always fires at the next 6am, no matter what time I deploy the DAG or do an Airflow update.
Your confusion may come from expecting Airflow to schedule DAGs like a cron job, which it does not.
The first DAG Run is created based on the minimum start_date for the tasks in your DAG. Subsequent DAG Runs are created by the scheduler process, based on your DAG’s schedule_interval, sequentially. Airflow schedules tasks at the END of the interval (see docs); you can view this answer for examples.
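To make the end-of-interval rule concrete, here is a minimal sketch in plain Python (no Airflow import). The helper name first_run_time is hypothetical, and the sketch assumes a start_date that already lines up with a daily schedule; it only models the rule described above and is not Airflow's actual scheduler code.

```python
from datetime import datetime, timedelta, timezone


def first_run_time(start_date: datetime, interval: timedelta) -> datetime:
    """Return when the first run fires under the end-of-interval rule.

    The first run covers [start_date, start_date + interval) and is only
    triggered once that interval has fully elapsed.
    """
    return start_date + interval


# 6am PST on 2022-01-01 is 14:00 UTC (PST is UTC-8).
start = datetime(2022, 1, 1, 14, 0, tzinfo=timezone.utc)
daily = timedelta(days=1)

print(first_run_time(start, daily))  # 2022-01-02 14:00:00+00:00
```

In other words, a daily DAG whose start_date is 6am on January 1 does not fire at that moment; its first run fires at 6am on January 2.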
As for your sample code: never set your start_date to be dynamic. It's a bad practice that can sometimes lead to the DAG never being executed, because now() always moves forward, so now() + interval may never be reached (see the Airflow FAQ).
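The following sketch illustrates why a moving start_date can starve a DAG. It is a simplified model in plain Python, not Airflow's real scheduler: the names is_due, dynamic_start and hours_until_first_run are hypothetical, and the rule "a run is due once start_date + interval has passed" is an assumption standing in for the behaviour the FAQ describes.

```python
from datetime import datetime, timedelta, timezone

INTERVAL = timedelta(days=1)
FIXED_START = datetime(2022, 1, 1, 5, 0, tzinfo=timezone.utc)
FIRST_CHECK = datetime(2022, 1, 1, 5, 30, tzinfo=timezone.utc)


def is_due(start_date: datetime, now: datetime) -> bool:
    # Simplified rule: the first run is due once a full interval has elapsed.
    return now >= start_date + INTERVAL


def dynamic_start(now: datetime) -> datetime:
    # Recomputed on every check, like a start_date derived from utcnow().
    return now.replace(hour=5, minute=0, second=0, microsecond=0)


def hours_until_first_run(start_for, first_check, max_hours=72):
    """Check once per hour; return the hour offset of the first due run, or None."""
    for h in range(max_hours + 1):
        now = first_check + timedelta(hours=h)
        if is_due(start_for(now), now):
            return h
    return None


print(hours_until_first_run(lambda now: FIXED_START, FIRST_CHECK))  # 24
print(hours_until_first_run(dynamic_start, FIRST_CHECK))            # None
```

With the fixed date, the run becomes due after one full interval. With the recomputed date, start_date + interval is always still in the future, so the condition is never met. In a real DAG file the usual fix is a constant, timezone-aware value, for example pendulum.datetime(2022, 1, 1, tz="America/Los_Angeles") (the date itself is arbitrary here), together with the 0 6 * * * schedule.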