Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get psycopg2's description from PostgreSQL server side cursor

I built a class that gets an arbitrary Postgres SQL query, fetch data and creates a CSV file. I'm using cursor.description to get column names, passing it as my csv header. However data sets got too large and I'm moving to server side cursors.

Server Side cursors doesn't seem to have any data under description. When I run:

import psycopg2

conn = psycopg2.connect(**conn_info)
cursor = conn.cursor("server_side")
cursor.execute("select * from foo")
print(cursor.description)

It prints None, probably because query didn't actually ran. But is there a way to get column names in this configuration?

like image 414
Brenno Flávio Avatar asked Jul 20 '19 22:07

Brenno Flávio


People also ask

What is a cursor Psycopg?

class cursor. Allows Python code to execute PostgreSQL command in a database session. Cursors are created by the connection. cursor() method: they are bound to the connection for the entire lifetime and all the commands are executed in the context of the database session wrapped by the connection.

What is a server side cursor?

A server-side cursor enables a result set to be generated on the server side, but not transferred to the client except for those rows that the client requests. For example, if a client executes a query but is only interested in the first row, the remaining rows are not transferred.


2 Answers

The query in cursor.execute('select...') is executed on the server side but the application has no data yet, hence cursor.description is undefined. To get the description you need to get at least a row from the server-side cursor, e.g.:

cursor = conn.cursor("server_side")
# or
# cursor = conn.cursor("server_side", scrollable=True)
# see below
cursor.execute("select * from my_table")
first_row = cursor.fetchone()
print(cursor.description)
# you can place the cursor in the initial position if needed:
# cursor.scroll(-1) 

Note that you won't get the description when the table is empty.

There is no better (faster or simpler) way to get a query result description of a named cursor. This is due to the way named cursors are implemented. The commands

cursor = conn.cursor("server_side")
cursor.execute("select * from my_table")

are implemented by declaring a cursor using the Postgres command:

DECLARE "server_side" CURSOR WITHOUT HOLD FOR select * from my_table

Per the documentation:

DECLARE allows a user to create cursors, which can be used to retrieve a small number of rows at a time out of a larger query. After the cursor is created, rows are fetched from it using FETCH.

The cursor declaration itself does not give any information on the structure of the results. We can obtain it only after fetching a row or rows by the FETCH command.

like image 57
klin Avatar answered Oct 16 '22 20:10

klin


The other answers here are; unfortunately, the answer and here's why. There is no way to get the description or even rowcount back from a server-side cursor without first invoking a fetch. It returns None as per PEP-249:

This attribute will be None for operations that do not return rows or if the cursor has not had an operation invoked via the .execute*() method yet.

This is because even though you've called execute, the server may not have yet executed the query and we can confirm that by checking the logs (where logging is set to all)

Using the following code with a 30-second sleep for clarity

cursor = conn.cursor("server_side")
cursor.execute("select * from foo")
time.sleep(30)
cursor.fetchall()
print(cursor.description)

The logs will show

2020-06-19 12:11:37.687 BST [11916] LOG:  statement: BEGIN
2020-06-19 12:11:37.687 BST [11916] LOG:  statement: DECLARE "server_side" CURSOR WITHOUT HOLD FOR select * from foo
2020-06-19 12:12:07.693 BST [11916] LOG:  statement: FETCH FORWARD ALL FROM "server_side"

Notice the 30~ second gap between the declaration and the FETCH, the latter being the invocation that allows us to get the description from the cursor.

Without server_side for comparison

2020-06-19 12:11:01.310 BST [3012] LOG:  statement: BEGIN
2020-06-19 12:11:01.311 BST [3012] LOG:  statement: select * from foo

Your only options are to use scroll or perform a LIMIT 1 select prior to your larger query.

A less attractive option is to use the INFORMATION_SCHEMA table like so

select column_name, data_type, character_maximum_length
from INFORMATION_SCHEMA.COLUMNS where table_name = 'foo';
like image 40
Lucan Avatar answered Oct 16 '22 20:10

Lucan