Python is slow when iterating over a large list

Question

I am currently selecting a large number of rows from a database using pyodbc. The result is copied into a large list, and I then iterate over that list. Before I abandon Python and rewrite this in C#, I want to know whether I am doing something wrong.

clientItemsCursor.execute("Select ids from largetable where year =?", year)
allIDRows = clientItemsCursor.fetchall()  # takes maybe 8 seconds

count = 0
for clientItemRow in allIDRows:
    aID = str(clientItemRow[0])
    # Do something with aID -- removed because I was trying to determine what was slow
    count = count + 1

Some more information:

The for loop is currently running at about 5 iterations per second, which seems insanely slow to me.
The total number of rows selected is ~489,000.
The machine it's running on has plenty of RAM and CPU. The script seems to use only one or two cores, and RAM usage is 1.72 GB of 4 GB.

Can anyone tell me what's wrong? Do scripts just run this slowly?

Thanks

Pablo Santa Cruz · Accepted Answer

It's probably slow because you load all the results into memory first and then iterate over a list. Try iterating over the cursor instead.

And no, scripts shouldn't be that slow.

clientItemsCursor.execute("Select ids from largetable where year =?", year)
count = 0
for clientItemrow in clientItemsCursor:  # rows are fetched as the loop advances
    aID = str(clientItemrow[0])
    count = count + 1
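To try the cursor-iteration pattern without a pyodbc connection, here is a minimal sketch that uses the standard library's sqlite3 module as a stand-in (an assumption: the original code talks to a different database through pyodbc). The helper names `make_demo_db` and `count_ids` are hypothetical, and the table layout is guessed from the query in the question.

```python
import sqlite3


def make_demo_db(n_rows, year):
    """Build a small in-memory table shaped like the one in the question (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE largetable (ids INTEGER, year INTEGER)")
    conn.executemany(
        "INSERT INTO largetable VALUES (?, ?)",
        ((i, year) for i in range(n_rows)),
    )
    return conn


def count_ids(conn, year):
    """Iterate over the cursor directly instead of calling fetchall() first."""
    cursor = conn.cursor()
    cursor.execute("SELECT ids FROM largetable WHERE year = ?", (year,))
    count = 0
    for row in cursor:  # sqlite3 and pyodbc cursors both support iteration
        a_id = str(row[0])  # placeholder for the real per-row work
        count += 1
    return count


if __name__ == "__main__":
    demo_conn = make_demo_db(1000, 2010)
    print(count_ids(demo_conn, 2010))
```

With pyodbc the loop body would stay the same; only the connection setup differs.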

jsbueno · Answer

This should not be slow with native Python lists, but maybe the ODBC driver is returning a "lazy" object that tries to be smart and just ends up slow. Try just doing

allIDRows = list(clientItemsCursor.fetchall())

in your code and post further benchmarks.

(Python lists can get slow if you start inserting things into the middle of them, but just iterating over a large list should be fast.)
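The difference between the two list operations can be checked with a small timing sketch. The function names `iterate` and `insert_in_middle` and the list size are illustrative assumptions, and the actual timings will depend on the machine.

```python
import timeit


def iterate(items):
    """Walk the list once, counting elements."""
    count = 0
    for _ in items:
        count += 1
    return count


def insert_in_middle(n):
    """Build a list of n items by always inserting at the midpoint."""
    items = []
    for i in range(n):
        # Each insert shifts every element after the midpoint.
        items.insert(len(items) // 2, i)
    return items


if __name__ == "__main__":
    n = 20000
    big = list(range(n))
    print("iterate:", timeit.timeit(lambda: iterate(big), number=10))
    print("middle inserts:", timeit.timeit(lambda: insert_in_middle(n), number=10))
```

Iteration touches each element once, while every middle insert has to move roughly half of the existing elements, so the second timing should grow much faster as `n` increases.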

jkerian · Answer

More investigation is needed here... consider the following script:

bigList = range(500000)
doSomething = ""
count = 0
arrayList = [[x] for x in bigList]  # takes a few seconds
for x in arrayList:
    doSomething += str(x[0])
    count += 1

This is pretty much the same as your script, minus the database stuff, and takes a few seconds to run on my not-terribly-fast machine.
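One way to continue the investigation is to time the fetch and the loop separately, so it is clear which phase is slow. The sketch below is an assumption-laden stand-in: it uses sqlite3 instead of pyodbc, a hypothetical helper named `time_phases`, and a table layout guessed from the query in the question.

```python
import sqlite3
import time


def time_phases(conn, year):
    """Return (row_count, timings), timing the fetch and the loop separately."""
    timings = {}
    cursor = conn.cursor()

    start = time.perf_counter()
    cursor.execute("SELECT ids FROM largetable WHERE year = ?", (year,))
    rows = list(cursor.fetchall())  # force a plain Python list
    timings["fetch"] = time.perf_counter() - start

    start = time.perf_counter()
    count = 0
    for row in rows:
        a_id = str(row[0])  # same per-row work as in the question
        count += 1
    timings["loop"] = time.perf_counter() - start

    return count, timings


# Demo data: an in-memory table standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE largetable (ids INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO largetable VALUES (?, ?)",
    ((i, 2010) for i in range(100000)),
)

if __name__ == "__main__":
    print(time_phases(conn, 2010))
```

If the "loop" figure stays small against a real pyodbc connection as well, the slowdown is more likely in the driver or in the work done per row than in Python's list iteration.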

Tags: python, sql, database, pyodbc

nycynik
