
How can the Airflow database be cleaned up periodically?

I am running Airflow with Postgres as the metadata database.

During operation, the web server became slow.

The cause was data continuously accumulating in the dag_run and log tables of the database (the web server became faster after I connected to Postgres and deleted the rows directly).

Does Airflow have an option to clean the database periodically?

If there is no such option, I will try deleting the data directly from a DAG script.

I also find it strange that the web server slows down just because there is a lot of data. Does the web server load all of the data every time another page is opened?

user14989010 asked Dec 04 '25 01:12


1 Answer

You can purge old records by running:

airflow db clean [-h] --clean-before-timestamp CLEAN_BEFORE_TIMESTAMP [--dry-run] [--skip-archive] [-t TABLES] [-v] [-y]

(cli reference)

It is quite common to include this command in a DAG that runs periodically.
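As a rough sketch of what such a periodic task might run, the helper below builds the `airflow db clean` command line for a given retention window. The function name `build_db_clean_command` and its `retention_days` parameter are made up for illustration; only the flags from the usage line above are used, and the table names are the ones mentioned in the question. It assumes the timestamp is accepted in ISO 8601 form and that `-t` takes a comma-separated list.

```python
from datetime import datetime, timedelta, timezone
import shlex


def build_db_clean_command(retention_days, tables=None, now=None, dry_run=False):
    """Build the argv list for `airflow db clean` (hypothetical helper).

    Rows older than `retention_days` days before `now` are targeted.
    """
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)

    cmd = ["airflow", "db", "clean", "--clean-before-timestamp", cutoff.isoformat()]
    if tables:
        # Assumption: -t accepts a comma-separated list of table names.
        cmd += ["-t", ",".join(tables)]
    if dry_run:
        # Only report what would be deleted.
        cmd.append("--dry-run")
    else:
        # Skip the interactive confirmation so it can run unattended.
        cmd.append("-y")
    return cmd


if __name__ == "__main__":
    # Preview a cleanup of the two tables from the question, keeping 90 days.
    print(shlex.join(build_db_clean_command(90, ["dag_run", "log"], dry_run=True)))
```

The resulting string could then be passed as the `bash_command` of a `BashOperator` in a DAG scheduled daily or weekly. Trying it with `--dry-run` first is a reasonable way to check what would be removed before letting it delete anything.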

TJaniF answered Dec 07 '25 03:12