Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can I use the same cursor while looping through it?

I am iterating through a SELECT result, like this:

import MySQLdb

conn = MySQLdb.connect(host = 127.0.0.1, user = ...) # and so on
cur = conn.cursor()

cur.execute("SELECT * FROM some_table")

for row in cur:
    # some stuff I'm doing
    # sometimes I need to perform another SELECT here

The question is, can I use cur again inside the for loop, or do I have to create another cursor (or even more - another connection)?

I guess I am missing some basic knowledge about databases or Python here... I am actually quite new with both. Also, my attempts to google the answer have failed.

I would even guess myself that I have to create another cursor, but I think I have actually used it for some time like this before I realized that it might be wrong and it seemed to work. But I am a bit confused now and can't guarantee it. So I just want to make sure.

like image 905
zbr Avatar asked Jan 07 '14 21:01

zbr


1 Answers

You have to create a new cursor. Otherwise, cur is now holding the results of your new "inner" select instead of your "outer" one.

This may work anyway, depending on your database library and your luck, but you shouldn't count on it. I'll try to explain below.

You don't need a new connection, however.

So:

cur.execute("SELECT * FROM some_table")

for row in cur:
    # some stuff I'm doing
    inner_cur = conn.cursor()
    inner_cur.execute("SELECT * FROM other_table WHERE column = row[1]")
    for inner_row in inner_cur:
        # stuff

So, why does it sometimes work?

Well, let's look at what a for row in cur: loop really does under the covers:

temp_iter = iter(cur)
while True:
    try:
        row = next(temp_iter)
    except StopIteration:
        break
    # your code runs here

Now, that iter(cur) calls the __iter__ method on the cursor object. What does that do? That's up to cur, an object of the cursor type provided by your database library.

The obvious implementation is to return some kind of iterator that has a reference to either the cursor object, or to the same row collection that the cursor object is using under the covers. This is what happens when you call iter on a list, for example.

But there's nothing requiring the database library to implement its __iter__ that way. It could create a copy of the row set for the iterator to use. Or, more plausibly, it could make the iterator refer to the current row set… but then change the cursor itself to refer to a different one when you next call execute. If it does that, then the old iterator keeps reading the old row set, and you can get a new iterator that iterates the new row set.

You shouldn't rely on that happening just because a database library is allowed to do that. But you also shouldn't rely on it not happening, of course.

In more concrete terms, imagine this type:

class Cursor(object):
    # other stuff
    def __iter__(self):
        return iter(self.rowset)
    def execute(self, sql, *args):
        self.rowset = self.db.do_the_real_work(sql, *args)
like image 65
abarnert Avatar answered Sep 17 '22 23:09

abarnert