I'm running PostgreSQL 9.6 (in Docker, using the postgres:9.6.13 image) and psycopg2 2.8.2.
My PostgreSQL server (local) hosts two databases. My goal is to create materialized views in one of the databases that use data from the other database using Postgres's foreign data wrappers. I do all this from a Python script that uses psycopg2.
This works well as long as creating the materialized view does not take too long (i.e. if the amount of data being imported isn't too large). However, if the process takes longer than roughly ~250 seconds, psycopg2 throws the exception
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
No error message (or any message concerning this whatsoever) can be found in Postgres's logs.
Materialized view creation completes successfully if I do it from an SQL client (Postico).
This code illustrates roughly what I'm doing in the Python script:
db = pg.connect(
dbname=config.db_name,
user=config.db_user,
password=config.db_password,
host=config.db_host,
port=config.db_port
)
with db.cursor() as c:
c.execute("""
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SERVER fdw FOREIGN DATA WRAPPER postgres_fdw OPTIONS (...);
CREATE USER MAPPING FOR CURRENT_USER SERVER fdw OPTIONS (...);
CREATE SCHEMA foreign;
IMPORT FOREIGN SCHEMA foreign_schema FROM SERVER fdw INTO foreign;
""")
c.execute("""
CREATE MATERIALIZED VIEW IF NOT EXISTS my_view AS (
SELECT (...)
FROM foreign.foreign_table
);
""")
Prerequisites. The current psycopg2 implementation supports: Python versions from 3.6 to 3.11. PostgreSQL server versions from 7.4 to 15.
The default connection timeout limit to PostgreSQL data sources is 30 seconds.
From PostgreSQL wiki. Psycopg2 is a mature driver for interacting with PostgreSQL from the Python scripting language. It is written in C and provides a means to perform the full range of SQL operations against PostgreSQL databases.
The Cursor class of the psycopg library provide methods to execute the PostgreSQL commands in the database using python code. Using the methods of it you can execute SQL statements, fetch data from the result sets, call procedures. You can create Cursor object using the cursor() method of the Connection object/class.
Adding the keepalive
parameters to the psycopg2.connect
call seems to have solved the problem:
self.db = pg.connect(
dbname=config.db_name,
user=config.db_user,
password=config.db_password,
host=config.db_host,
port=config.db_port,
keepalives=1,
keepalives_idle=30,
keepalives_interval=10,
keepalives_count=5
)
I still don't know why this is necessary. I can't find anyone else who has described having to use the keepalives
parameter keywords when using Postgres in Docker just to be able to run queries that take longer than 4-5 minutes, but maybe it's obvious enough that nobody has noted it?
We encountered the same issue, and resolved it by adding net.ipv4.tcp_keepalive_time=200
to our docker-compose.yml
file:
services:
myservice:
image: myimage
sysctls:
- net.ipv4.tcp_keepalive_time=200
From what I understand this will signal that the connection is alive after 200 seconds, which is less than the time it takes to drop the connection (300 seconds?), thus preventing it from being dropped.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With