I have a table with 4million rows and I use psycopg2 to execture a:
SELECT * FROM ..WHERE query
I haven't heard before of the server side cursor and I am reading its a good practice when you expect lots of results.
I find the documentation a bit limited and I have some basic questions.
First I declare the server-side cursor as:
cur = conn.cursor('cursor-name')
then I execute the query as:
cur.itersize = 10000 sqlstr = "SELECT clmn1, clmn2 FROM public.table WHERE clmn1 LIKE 'At%'" cur.execute(sqlstr)
My question is: What do I do now? How do I get the results?
Do I iterate through the rows as:
row = cur.fetchone() while row: row = cur.fetchone()
or I use fetchmany() and I do this:
row = cur.fetchmany(10)
But in the second case how can I "scroll" the results?
Also what is the point of itersize?
The Cursor class of the psycopg library provide methods to execute the PostgreSQL commands in the database using python code. Using the methods of it you can execute SQL statements, fetch data from the result sets, call procedures. You can create Cursor object using the cursor() method of the Connection object/class.
A server-side cursor enables a result set to be generated on the server side, but not transferred to the client except for those rows that the client requests. For example, if a client executes a query but is only interested in the first row, the remaining rows are not transferred.
Cursors are created by the connection. cursor() method: they are bound to the connection for the entire lifetime and all the commands are executed in the context of the database session wrapped by the connection.
Psycopg2 has a nice interface for working with server side cursors. This is a possible template to use:
with psycopg2.connect(database_connection_string) as conn: with conn.cursor(name='name_of_cursor') as cursor: cursor.itersize = 20000 query = "SELECT * FROM ..." cursor.execute(query) for row in cursor: # process row
The code above creates the connection and automatically places the query result into a server side cursor. The value itersize
sets the number of rows that the client will pull down at a time from the server side cursor. The value you use should balance number of network calls versus memory usage on the client. For example, if your result count is three million, an itersize
value of 2000 (the default value) will result in 1500 network calls. If the memory consumed by 2000 rows is light, increase that number.
When using for row in cursor
you are of course working with one row at a time, but Psycopg2 will prefetch itersize
rows at a time for you.
If you want to use fetchmany
for some reason, you could do something like this:
while True: rows = cursor.fetchmany(100) if len(rows) > 0: for row in rows: # process row else: break
This usage of fetchmany
will not trigger a network call to the server for more rows until the prefetched batch has been exhausted. (This is a convoluted example that provides nothing over the code above, but demonstrates how to use fetchmany
should there be a need.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With