I have a very large table with 250,000+ rows, many of which contain a large text block in one of the columns. Right now it's 2.7 GB and expected to grow at least tenfold. I need to perform Python-specific operations on every row of the table, but only need to access one row at a time.
Right now my code looks something like this:
c.execute('SELECT * FROM big_table')
table = c.fetchall()
for row in table:
    do_stuff_with_row(row)
This worked fine when the table was smaller, but the table is now larger than my available RAM and Python hangs when I try to run it. Is there a better (more RAM-efficient) way to iterate row by row over the entire table?
For background, the standard way of querying data with the sqlite3 module: first, establish a connection to the SQLite database by creating a Connection object. Next, create a Cursor object using the cursor() method of the Connection object. Then execute a SELECT statement. After that, call the fetchall() method of the cursor object to fetch the data.
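A minimal sketch of that pattern (the example.db path and big_table name are placeholders, not from the question):

import sqlite3

# Connect and create a cursor (the file name here is a placeholder)
conn = sqlite3.connect('example.db')
c = conn.cursor()

# fetchall() materializes every result row in memory at once
c.execute('SELECT * FROM big_table')
rows = c.fetchall()
for row in rows:
    print(row)

conn.close()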
The theoretical maximum number of rows in a table is 2^64 (18446744073709551616, or about 1.8e+19). This limit is unreachable since the maximum database size of 281 terabytes will be reached first.
cursor.fetchall()
fetches all results into a list first.
Instead, you can iterate over the cursor itself:
c.execute('SELECT * FROM big_table')
for row in c:
    do_stuff_with_row(row)
The cursor fetches rows lazily as you iterate, rather than loading them all into memory first.
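As a self-contained sketch (again, example.db, big_table, and do_stuff_with_row are placeholder names standing in for the question's actual ones):

import sqlite3

def do_stuff_with_row(row):
    # Placeholder for the real per-row processing
    print(row[0])

conn = sqlite3.connect('example.db')
c = conn.cursor()
c.execute('SELECT * FROM big_table')

# Iterating over the cursor pulls rows on demand, so memory use
# stays roughly constant regardless of how large the table is.
for row in c:
    do_stuff_with_row(row)

conn.close()

If you want explicit control over how many rows are buffered at a time, cursor.fetchmany(size) is another option: it returns successive chunks of rows and an empty list once the results are exhausted.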