I am running Airflow with Postgres as the metadata database.
During operation, the web server became slow.
The cause was data continuously accumulating in the dag_run and log tables of the database (it became faster after I connected to Postgres and deleted the data directly).
Are there any airflow options to clean the db periodically?
If there is no such option, we will try to delete the data directly from a DAG script.
I also find it strange that the web server slows down just because there is a lot of data. Does the web server fetch all of the data every time a page is opened?
You can purge old records by running:
airflow db clean [-h] --clean-before-timestamp CLEAN_BEFORE_TIMESTAMP [--dry-run] [--skip-archive] [-t TABLES] [-v] [-y]
(cli reference)
It is quite common to run this command from a DAG on a periodic schedule.
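A minimal sketch of such a DAG, assuming Airflow 2.4 or later (for the `schedule` argument and the TaskFlow decorators) and a 30-day retention period. The names `build_db_clean_argv`, `RETENTION_DAYS`, and `db_cleanup` are hypothetical and chosen only for illustration; the flags are the ones shown in the command above. Note that `--skip-archive` deletes the rows without keeping an archive copy, so drop it if you want the archive tables.

```python
import subprocess
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # assumption: keep the last 30 days of metadata


def build_db_clean_argv(retention_days, tables=("dag_run", "log"), now=None):
    """Build the `airflow db clean` argument list for a given retention period."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [
        "airflow", "db", "clean",
        "--clean-before-timestamp", cutoff.isoformat(),
        "-t", ",".join(tables),  # limit the purge to the tables that grew
        "--skip-archive",        # do not keep archive copies of deleted rows
        "-y",                    # skip the interactive confirmation prompt
    ]


try:
    from airflow.decorators import dag, task
except ImportError:  # lets the helper above be used without Airflow installed
    dag = None

if dag is not None:

    @dag(
        dag_id="db_cleanup",
        schedule="@daily",
        start_date=datetime(2024, 1, 1),
        catchup=False,
    )
    def db_cleanup():
        @task
        def clean():
            # Compute the cutoff at run time, then call the CLI.
            subprocess.run(build_db_clean_argv(RETENTION_DAYS), check=True)

        clean()

    cleanup_dag = db_cleanup()
```

Adding `--dry-run` to the argument list first is a cheap way to check what would be deleted before letting the DAG purge anything.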