I want to write a generator function that will run on a memory limited system that uses PyMySql (or MySQLDb) to return the results of a select query one at a time. The following works:
#execute a select query and return results as a generator
def SQLSelectGenerator(self,stmt):
#error handling code removed
cur.execute(stmt)
row = ""
while row is not None:
row = self.cur.fetchone()
yield row
However, the following also seems to work, but it is cryptic as to whether it is performing a fetchall(). I can't find in the Python DB API what exactly happens when you iterate a cursor object as a list:
#execute a select query and return results as a generator
def SQLSelectGenerator(self,stmt):
#error handling code removed
cur.execute(stmt)
for row in self.cur:
yield row
In both cases, the following prints all the rows sucessfully
stmt = "select * from ..."
for l in SQLSelectGenerator(stmt):
print(l)
So I would like to know whether the second implementation is better or worse, and whether it is calling fetchall or doing something tricky with fetchone. Fetchall will blow up the system this will run on because there are millions of rows.
According to the PyMySql source, doing
for row in self.cur:
yield row
Meaning you're internally executing fetchone()
repeatedly, just like your first example:
class Cursor(object):
'''
This is the object you use to interact with the database.
'''
...
def __iter__(self):
return iter(self.fetchone, None)
So I would expect the two approaches to be essentially equal in terms of memory usage and performance. You may as well use the second one since its cleaner and simpler.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With