Airflow psycopg2.OperationalError: FATAL: sorry, too many clients already

Tags:

I have a four node clustered Airflow environment that's been working fine for me for a few months now.

ec2-instances

Server 1: Webserver, Scheduler, Redis Queue, PostgreSQL Database
Server 2: Webserver
Server 3: Worker
Server 4: Worker

Recently I've been working on a more complex DAG that has a few dozen tasks in it compared to my relatively small ones I was working on beforehand. I'm not sure if that's why I'm just now seeing this error pop up or what but I'll sporadically get this error:

On the Airflow UI under the logs for the task:

psycopg2.OperationalError: FATAL: sorry, too many clients already

And on the Webserver (output from running airflow webserver) I get the same error too:

[2018-07-23 17:43:46 -0400] [8116] [ERROR] Exception in worker process
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2158, in _wrap_pool_connect
    return fn()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 403, in connect
    return _ConnectionFairy._checkout(self)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 788, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 532, in checkout
    rec = pool._do_get()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 1193, in _do_get
    self._dec_overflow()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 187, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 1190, in _do_get
    return self._create_connection()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 350, in _create_connection
    return _ConnectionRecord(self)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 477, in __init__
    self.__connect(first_connect_check=True)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 671, in __connect
    connection = pool._invoke_creator(self)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 106, in connect
    return dialect.connect(*cargs, **cparams)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 410, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/usr/local/lib64/python3.6/site-packages/psycopg2/__init__.py", line 130, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: FATAL:  sorry, too many clients already

I can fix this by running sudo /etc/init.d/postgresql restart and restarting the DAG but then after about three runs I'll start seeing the error again.

I can't find any specifics on this issue in regards to Airflow but from other posts I've found such as this one they're saying it's because my client (I guess in this case that's Airflow) is trying to open up more connections to PostgreSQL than what PostgreSQL is configured to handle. I ran this command to find that my PostgreSQL can accept 100 connections:

[ec2-user@ip-1-2-3-4 ~]$ sudo su
root@ip-1-2-3-4
[/home/ec2-user]# psql -U postgres
psql (9.2.24)
Type "help" for help.

postgres=# show max_connections;
 max_connections
-----------------
 100
(1 row)

In this solution the post says I can increase my PostgreSQL max connections but I'm wondering if I should instead set a value in my Airflow.cfg file so that I can match the Airflow allowed connections size to my PoastgreSQL max connections size. Does anyone know where I can set this value in Airflow? Here are the fields I think are relevant:

# The SqlAlchemy pool size is the maximum number of database connections
# in the pool.
sql_alchemy_pool_size = 5

# The SqlAlchemy pool recycle is the number of seconds a connection
# can be idle in the pool before it is invalidated. This config does
# not apply to sqlite.
sql_alchemy_pool_recycle = 3600

# The amount of parallelism as a setting to the executor. This defines
# the max number of task instances that should run simultaneously
# on this airflow installation
parallelism = 32

# The number of task instances allowed to run concurrently by the scheduler
dag_concurrency = 32

# When not using pools, tasks are run in the "default pool",
# whose size is guided by this config element
non_pooled_task_slot_count = 128

# The maximum number of active DAG runs per DAG
max_active_runs_per_dag = 32

Open to any suggestions for fixing this issue. Is this something related to my Airflow configuration or is it an issue with my PostgreSQL configuration?

Also, because I'm testing a new DAG I'll sometimes terminate the running tasks and start them over. Perhaps doing this is causing some of the processes to not die correctly and they're keeping dead connections open to PostgreSQL?

816

asked Jul 23 '18 21:07

Kyle Bridenstine

1 Answers

Ran into similar issue. I changed max_connections in postgres to 10000 and sql_alchemy_pool_size in airflow config to 1000. Now I am able to run hundreds of tasks in parallel.

PS: My machine has 32 cores and 60GB memory. Hence, its taking the load.

111

answered Sep 30 '22 07:09

Zanylytical Scientist

Related questions
                            
                                Insert file records into postgres db using clojure jdbc is taking long time compared to python psycopg2
                            
                                Auto generate id in Spring MVC
                            
                                Can someone explain how Postgresql roles, Postgresql users and Linux users relate to each other?
                            
                                postgres streaming replication - slave only index
                            
                                sequelize autoIncrement starting id for postgres
                            
                                How to rename an Amazon RDS hosted PostgreSQL database
                            
                                Restore database state after integration test using Docker?
                            
                                Connecting Excel 2016 to Postgres
                            
                                psycopg2: How to know when cur.rowcount does not mean number of rows?
                            
                                Using CTE (WITH queries) in Laravel query builder
                            
                                Simple SQL path query?
                            
                                How to connect to PostgreSQL from Phoenix Web App via SSL?
                            
                                PG::UnableToSend: no connection to the server in Rails
                            
                                How to generate a date range + count earlier dates from another table in PostgreSQL?
                            
                                Is this the correct way to bulk INSERT ON CONFLICT in Postgres?
                            
                                Hooking into start of JDBC transaction
                            
                                AWS Lambda with Knex JS and RDS Postgres
                            
                                java.math.BigInteger cannot be cast to java.math.BigDecimal
                            
                                pg_restore error: pg_restore: [archiver] unsupported version (1.13) in file header
                            
                                MySQL interprets SERIALIZABLE less strenuously than PostgreSQL. Is it correct?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Airflow psycopg2.OperationalError: FATAL: sorry, too many clients already

Tags:

postgresql

postgresql-9.2

airflow

Kyle Bridenstine

People also ask

1 Answers

Zanylytical Scientist

Recent Activity

Donate For Us