
apache-airflow 1.9: setting the default timezone to non-UTC

I recently upgraded from Airflow 1.8 to apache-airflow 1.9. The upgrade was successful and I have scaled the environment using the Celery Executor. Everything seems to be working fine, but DAG and task start dates, execution dates, etc. all appear in UTC, and the scheduled DAGs run in UTC. Before the upgrade they ran in the local timezone, which is PDT.

Any ideas on how to make PDT the default timezone in Airflow?

I have tried setting default_timezone = pdt in airflow.cfg, but even after restarting all the services it still schedules the DAGs and tasks in UTC. Looking forward to your help on setting the default timezone to PDT.

Amit Kumar asked Jan 10 '18 18:01


People also ask

How do I change the default time zone in Airflow?

You can change it by setting default_timezone in the [core] section of the Airflow config file (airflow.cfg), or by setting the environment variable AIRFLOW__CORE__DEFAULT_TIMEZONE at run time.
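
As a rough illustration of the environment-variable route, the sketch below builds the variable name from a config section and key using Airflow's AIRFLOW__{SECTION}__{KEY} naming convention and sets it for the current process. The helper name airflow_env_var is hypothetical (it is not part of Airflow), and the variable has to be set before the Airflow services start for it to have any effect.

```python
import os


def airflow_env_var(section: str, key: str) -> str:
    """Build the env var name Airflow reads for a given [section] key.

    Hypothetical helper for illustration; it only formats the name.
    """
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


name = airflow_env_var("core", "default_timezone")

# Assumed value: an IANA zone identifier for US Pacific Time.
os.environ[name] = "America/Los_Angeles"

print(name, "=", os.environ[name])
```

In practice the variable would more likely be exported in the shell or service definition that launches the scheduler, webserver and workers.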

Is Airflow UTC time?

Airflow stores datetime information in UTC internally and in the database. It allows you to run your DAGs with time zone dependent schedules. At the moment, Airflow does not convert them to the end user's time zone in the user interface. It will always be displayed in UTC there.

What is Start_date in Airflow DAG?

Similarly, since the start_date argument for the DAG and its tasks points to the same logical date, it marks the start of the DAG's first data interval, not when tasks in the DAG will start running. In other words, a DAG run will only be scheduled one interval after start_date .
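
The "one interval after start_date" rule can be sketched with plain datetime arithmetic. The dates and the daily interval below are made-up values for illustration, and the variable names are not Airflow API.

```python
from datetime import datetime, timedelta

# Assumed example values: a DAG that starts on 2018-01-01 and runs daily.
start_date = datetime(2018, 1, 1)
schedule_interval = timedelta(days=1)

# The first run covers the interval that begins at start_date...
first_execution_date = start_date
# ...but it is only triggered once that interval has ended.
first_trigger_time = start_date + schedule_interval

print("execution_date of first run:", first_execution_date)
print("first run is scheduled at:  ", first_trigger_time)
```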

What is Airflow Execution_date?

Execution date or execution_date is a historical name for what is called a logical date, and also usually the start of the data interval represented by a DAG run. Airflow was developed as a solution for ETL needs.


2 Answers

Airflow running in the local timezone prior to version 1.9.0 was unintended and just a side effect of Airflow code using datetime.now() and datetime.today() instead of datetime.utcnow(). This was rectified in 1.9.0 under AIRFLOW-289, making things timezone independent (always UTC) as you have observed.
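
To see why the old behaviour depended on the machine, here is a small standalone sketch (not Airflow code) comparing a naive local timestamp with an explicit UTC one. It uses datetime.now(timezone.utc) as the timezone-aware counterpart of datetime.utcnow().

```python
from datetime import datetime, timezone

# Naive wall-clock time in whatever zone the host is configured for.
# Code written this way behaves differently on a PDT box and a UTC box.
local_naive = datetime.now()

# Explicit UTC, independent of the host's timezone setting.
utc_aware = datetime.now(timezone.utc)

print("local (naive):", local_naive, "tzinfo =", local_naive.tzinfo)
print("UTC (aware):  ", utc_aware, "offset =", utc_aware.utcoffset())
```

On a host set to Pacific time the two printed clock values differ by the current UTC offset; on a host set to UTC they match, which is why the pre-1.9.0 scheduling appeared "local".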

Official support for Airflow to be timezone aware is merged into the master branch. This work was completed as part of AIRFLOW-288 and is not available in the latest stable version (1.9.0). You can probably expect it in the next major release.

Once you have that change, Matt's answer should get you what you're looking for.

Daniel Huang answered Nov 07 '22 09:11


According to these docs, the default_timezone accepts an IANA TZ Database time zone identifier. They are listed here.

If you want US Pacific Time, you should set default_timezone=America/Los_Angeles.
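
A quick way to check whether a string is an IANA identifier before putting it in airflow.cfg is Python's standard zoneinfo module (Python 3.9+). This is a minimal sketch; the function name is_iana_zone is made up for illustration, and the result depends on the time zone data installed on the machine.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError, available_timezones


def is_iana_zone(name: str) -> bool:
    """Return True if `name` resolves in the local IANA time zone database."""
    try:
        ZoneInfo(name)
        return True
    except (ZoneInfoNotFoundError, ValueError):
        # Unknown key, or a malformed key such as an absolute path.
        return False


# "pdt" is an abbreviation, not a database key; America/Los_Angeles is a key.
for candidate in ("pdt", "America/Los_Angeles"):
    print(candidate, "->", is_iana_zone(candidate))
```

The abbreviation from the question fails this check, which is consistent with default_timezone = pdt having no effect.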

Matt Johnson-Pint answered Nov 07 '22 08:11