How to resolve DB connection invalidated warning in Airflow Scheduler?

I am upgrading our Airflow instance from 1.9 to 1.10.3, and whenever the scheduler runs I now get a warning that the database connection has been invalidated and that it is trying to reconnect. Several of these warnings show up in a row. The console also indicates that tasks are being scheduled, but when I check the database nothing has been written.

The following warning shows up where it didn't before:

[2019-05-21 17:29:26,017] {sqlalchemy.py:81} WARNING - DB connection invalidated. Reconnecting...

Eventually, I also get this error:

FATAL: remaining connection slots are reserved for non-replication superuser connections

I've tried increasing the SQLAlchemy pool size setting in airflow.cfg, but that had no effect:

# The SqlAlchemy pool size is the maximum number of database connections in the pool.
sql_alchemy_pool_size = 10
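As a rough way to reason about why a larger pool may not help, here is a minimal sketch of the connection arithmetic. It assumes that each Airflow process keeps its own SQLAlchemy pool, that the pool can grow to pool_size plus SQLAlchemy's default max_overflow of 10, and that PostgreSQL runs with its default max_connections of 100 and superuser_reserved_connections of 3. The process count and the function names are hypothetical; only sql_alchemy_pool_size comes from the config above.

```python
import configparser

# Hypothetical airflow.cfg fragment; only sql_alchemy_pool_size is from the question.
AIRFLOW_CFG = """
[core]
# The SqlAlchemy pool size is the maximum number of database connections in the pool.
sql_alchemy_pool_size = 10
"""


def worst_case_connections(cfg_text, process_count, max_overflow=10):
    """Upper bound on open connections if every process fills its own pool.

    max_overflow=10 is SQLAlchemy's QueuePool default; it is an assumption
    that Airflow does not override it in this setup.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    pool_size = parser.getint("core", "sql_alchemy_pool_size")
    return process_count * (pool_size + max_overflow)


def available_slots(max_connections=100, superuser_reserved=3):
    """Slots open to ordinary roles (PostgreSQL defaults, assumed here)."""
    return max_connections - superuser_reserved


if __name__ == "__main__":
    # Hypothetical count of processes that each hold a pool
    # (scheduler children, webserver workers, Celery worker).
    processes = 6
    needed = worst_case_connections(AIRFLOW_CFG, processes)
    print("worst case:", needed, "available:", available_slots())
```

Under these assumptions the pool size is a per-process limit, so raising it raises the worst-case total rather than lifting the server-side cap that the FATAL error refers to.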

I'm using CeleryExecutor and I'm thinking that maybe the number of workers is overloading the database connections.

I run three commands (airflow webserver, airflow scheduler, and airflow worker), so there should be only one worker, and I don't see why that would overload the database.

How do I resolve the database connection errors? Is there a setting to increase the number of database connections, and if so, where is it? Do I need to handle the workers differently?


Update:

Even with no workers running and the webserver and scheduler started fresh, the DB connection warning starts to appear once the scheduler fills up the Airflow pools.


Update 2:

I found the following issue in the Airflow Jira: https://issues.apache.org/jira/browse/AIRFLOW-4567

Others there report seeing the same issue. It is unclear whether it directly causes the crashes that some people are seeing or whether it is just an annoying cosmetic log message. As of yet there is no resolution to this problem.

trker asked May 21 '19 21:05


1 Answer

This has been resolved in Airflow 1.10.4, the latest version at the time of writing.

I believe it was fixed by AIRFLOW-4332, updating SQLAlchemy to a newer version.

Pull request
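To check whether a given installation is at or past the release named above, a small version comparison can help. This is only a sketch: the helper names are hypothetical, and the 1.10.4 threshold is taken from this answer rather than verified independently. Versions are compared as integer tuples because plain string comparison would rank "1.10.10" below "1.10.4".

```python
import re

# Release named in the answer as containing the fix (taken from the answer, not verified).
FIXED_IN = (1, 10, 4)


def parse_version(version):
    """Return the leading major.minor.patch numbers as a tuple of ints.

    Any suffix such as "rc1" or ".dev0" is ignored in this sketch.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def includes_fix(version):
    """True if the version is at or after the release named in the answer."""
    return parse_version(version) >= FIXED_IN


if __name__ == "__main__":
    for candidate in ("1.9.0", "1.10.3", "1.10.4", "1.10.10"):
        print(candidate, includes_fix(candidate))
```

The version string to test could come from airflow.__version__ in the environment that runs the scheduler.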

trker answered Oct 18 '22 20:10