Python psycopg2 cursors

From psycopg2 documentation:

When a database query is executed, the Psycopg cursor usually fetches all the records returned by the backend, transferring them to the client process. If the query returned a huge amount of data, a proportionally large amount of memory will be allocated by the client. If the dataset is too large to be practically handled on the client side, it is possible to create a server side cursor.

I would like to query a table with possibly thousands of rows and perform some action for each one. Will normal cursors actually bring the entire data set to the client? That doesn't sound very reasonable. The code is something along the lines of:

import psycopg2

conn = psycopg2.connect(url)
cursor = conn.cursor()
cursor.execute(sql)
for row in cursor:
    pass  # do some stuff with row
cursor.close()

I would expect this to be a streaming operation. My second question is about the scope of cursors. Inside my loop I would like to update another table. Do I need to open a new cursor and close it every time? Each item update should be in its own transaction, as I might need to do a rollback.

for row in cursor:
    anotherCursor = anotherConn.cursor()
    anotherCursor.execute(update)
    if somecondition:
        anotherConn.commit()
    else:
        anotherConn.rollback()
cursor.close()

======== EDIT: MY ANSWER TO FIRST PART ========

OK, I will try to answer the first part of my question. Normal cursors actually fetch the entire data set as soon as you call execute, before you even start iterating over the result set. You can verify that by checking the memory footprint of the process at each step. But the need for a server-side cursor actually comes from the PostgreSQL server and not the client, and is documented here: http://www.postgresql.org/docs/9.3/static/sql-declare.html

Now, this is not immediately apparent from the documentation, but such cursors can be created temporarily for the duration of the transaction. There is no need to explicitly create a function in the database that returns a refcursor with the specific SQL statement, etc. With psycopg2 you only need to give a name when obtaining the cursor, and a temporary cursor will be created for that transaction. So instead of:

 cursor = conn.cursor()

you just need to do:

 cursor = conn.cursor('mycursor')

That's it and it works. I assume the same thing is done under the covers when using JDBC, when setting fetchSize. It's just a bit more transparent. See docs here: https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor
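To make the named-cursor approach concrete, here is a minimal sketch of a streaming helper. The function name `stream_rows`, the default cursor name, and the batch size are my own choices for illustration, not anything psycopg2 prescribes. It relies on `conn.cursor(name)` and the `itersize` attribute, which controls how many rows a named cursor pulls from the server per round trip while you iterate.

```python
def stream_rows(conn, sql, params=None, name="mycursor", batch_size=2000):
    """Yield rows one at a time through a named (server-side) cursor.

    `conn` is assumed to be an open psycopg2 connection.
    `name` and `batch_size` are illustrative defaults.
    """
    cursor = conn.cursor(name)      # passing a name makes the cursor server-side
    cursor.itersize = batch_size    # rows fetched per network round trip
    try:
        cursor.execute(sql, params)
        for row in cursor:
            yield row
    finally:
        cursor.close()              # also closes the cursor on the server
```

Usage would look something like `for row in stream_rows(psycopg2.connect(url), sql): ...`, with client memory bounded by roughly one batch of rows rather than the whole result set.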

You can test that this works by querying the pg_cursors view inside the same transaction. The server-side cursor appears once the named cursor has executed its query and disappears after the client-side cursor is closed. So, bottom line: I'm happy to make that change to my code, but I must say this was a big gotcha for someone not that experienced with Postgres.
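The pg_cursors check described above could be sketched like this. The helper name `list_server_cursors` is hypothetical; the only thing it assumes about the database is that the `pg_cursors` system view has a `name` column. It has to run on the same connection that owns the named cursor, because cursors are only visible within their own session.

```python
def list_server_cursors(conn):
    """Return the names of the cursors currently open in this session.

    `conn` is assumed to be the psycopg2 connection that holds the
    named cursor; a plain (client-side) cursor is used for the lookup.
    """
    cur = conn.cursor()
    try:
        cur.execute("SELECT name FROM pg_cursors")
        return [row[0] for row in cur.fetchall()]
    finally:
        cur.close()
```

Calling it after the named cursor has executed its query should list that cursor's name, and calling it again after closing the named cursor should no longer include it.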

asked May 24 '15 19:05

Nazaret K.

1 Answer

Actually, you have already answered the question ;).

Yes, you should use a server-side cursor to get the records streamed: http://initd.org/psycopg/docs/usage.html#server-side-cursors

From docs:

CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS $$
BEGIN
    OPEN $1 FOR SELECT col FROM test;
    RETURN $1;
END;
$$ LANGUAGE plpgsql;

And in code:

cur1 = conn.cursor()
cur1.callproc('reffunc', ['curname'])

cur2 = conn.cursor('curname')
for record in cur2:     # or cur2.fetchone, fetchmany...
    # do something with record
    pass

Yes, you should open a new cursor if you want to fetch the rows with a server-side cursor.
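For the second part of the question (one transaction per updated row), a rough sketch follows. Everything named here is an assumption for illustration: the function `update_each_row`, the cursor name `row_stream`, the `should_commit` callback standing in for `somecondition`, and the idea that the UPDATE takes the streamed row as its parameters. It uses a second connection for the writes because a named cursor is declared without HOLD by default, so committing or rolling back on the reading connection would close it mid-iteration. The write cursor is opened once and reused rather than reopened per row.

```python
def update_each_row(read_conn, write_conn, select_sql, update_sql, should_commit):
    """Stream rows from read_conn and update via write_conn, one transaction per row.

    Both connections are assumed to be open psycopg2 connections.
    `should_commit(row)` is a hypothetical callback returning True or False.
    Returns a (committed, rolled_back) pair of counts.
    """
    read_cur = read_conn.cursor("row_stream")   # server-side cursor for reading
    write_cur = write_conn.cursor()             # ordinary cursor, reused for every update
    committed = rolled_back = 0
    try:
        read_cur.execute(select_sql)
        for row in read_cur:
            write_cur.execute(update_sql, row)  # assumes the UPDATE is parameterized by the row
            if should_commit(row):
                write_conn.commit()
                committed += 1
            else:
                write_conn.rollback()
                rolled_back += 1
    finally:
        write_cur.close()
        read_cur.close()
    return committed, rolled_back
```

Since psycopg2 starts a new transaction automatically on the first statement after a commit or rollback, each update in the loop ends up in its own transaction without any explicit BEGIN.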

answered Sep 16 '22 14:09

kwarunek

Tags:

python

psycopg2

python-2.7

database-cursor
