Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres closes connection during query after a few hundred seconds when using Psycopg2

I'm running PostgreSQL 9.6 (in Docker, using the postgres:9.6.13 image) and psycopg2 2.8.2.

My PostgreSQL server (local) hosts two databases. My goal is to create materialized views in one of the databases that use data from the other database using Postgres's foreign data wrappers. I do all this from a Python script that uses psycopg2.

This works well as long as creating the materialized view does not take too long (i.e. if the amount of data being imported isn't too large). However, if the process takes longer than roughly ~250 seconds, psycopg2 throws the exception

psycopg2.OperationalError: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

No error message (or any message concerning this whatsoever) can be found in Postgres's logs.

Materialized view creation completes successfully if I do it from an SQL client (Postico).

This code illustrates roughly what I'm doing in the Python script:

db = pg.connect(
    dbname=config.db_name,
    user=config.db_user,
    password=config.db_password,
    host=config.db_host,
    port=config.db_port
)
with db.cursor() as c:
    c.execute("""
        CREATE EXTENSION IF NOT EXISTS postgres_fdw;
        CREATE SERVER fdw FOREIGN DATA WRAPPER postgres_fdw OPTIONS (...);
        CREATE USER MAPPING FOR CURRENT_USER SERVER fdw OPTIONS (...);
        CREATE SCHEMA foreign;
        IMPORT FOREIGN SCHEMA foreign_schema FROM SERVER fdw INTO foreign;
    """)
    c.execute("""
        CREATE MATERIALIZED VIEW IF NOT EXISTS my_view AS (
            SELECT (...)
            FROM foreign.foreign_table
        );
    """)
like image 716
haroba Avatar asked May 24 '19 09:05

haroba


People also ask

Does psycopg2 need PostgreSQL?

Prerequisites. The current psycopg2 implementation supports: Python versions from 3.6 to 3.11. PostgreSQL server versions from 7.4 to 15.

What is connection timeout in PostgreSQL?

The default connection timeout limit to PostgreSQL data sources is 30 seconds.

What is PostgreSQL psycopg2?

From PostgreSQL wiki. Psycopg2 is a mature driver for interacting with PostgreSQL from the Python scripting language. It is written in C and provides a means to perform the full range of SQL operations against PostgreSQL databases.

What is cursor in Psycopg?

The Cursor class of the psycopg library provide methods to execute the PostgreSQL commands in the database using python code. Using the methods of it you can execute SQL statements, fetch data from the result sets, call procedures. You can create Cursor object using the cursor() method of the Connection object/class.


2 Answers

Adding the keepalive parameters to the psycopg2.connect call seems to have solved the problem:

self.db = pg.connect(
            dbname=config.db_name,
            user=config.db_user,
            password=config.db_password,
            host=config.db_host,
            port=config.db_port,
            keepalives=1,
            keepalives_idle=30,
            keepalives_interval=10,
            keepalives_count=5
        )

I still don't know why this is necessary. I can't find anyone else who has described having to use the keepalives parameter keywords when using Postgres in Docker just to be able to run queries that take longer than 4-5 minutes, but maybe it's obvious enough that nobody has noted it?

like image 52
haroba Avatar answered Nov 09 '22 02:11

haroba


We encountered the same issue, and resolved it by adding net.ipv4.tcp_keepalive_time=200 to our docker-compose.yml file:

services:
  myservice:
    image: myimage
    sysctls:
      - net.ipv4.tcp_keepalive_time=200

From what I understand this will signal that the connection is alive after 200 seconds, which is less than the time it takes to drop the connection (300 seconds?), thus preventing it from being dropped.

like image 1
André C. Andersen Avatar answered Nov 09 '22 03:11

André C. Andersen