I can do very efficient bulk inserts with sqlite3 in Python (2.7) using this code:
cur.executemany("INSERT INTO " + tableName + " VALUES (?, ?, ?, ?);", data)
But I can't get updates to work efficiently. I thought it might be a problem of the database structure/indexing, but even on a test database with only one table of 100 rows, the update still takes about 2-3 seconds.
I've tried different code variations. The latest code I have is from this answer to a previous question about update and executemany, but it's just as slow for me as any other attempt I've made:
data = []
for s in sources:
    source_id = s['source_id']
    val = get_value(s['source_attr'])
    x = [val, source_id]
    data.append(x)
cur.executemany("UPDATE sources SET source_attr = ? WHERE source_id = ?", data)
con.commit()
How could I improve this code to do a big bulk update efficiently?
When inserting a record, the database just needs to write a row at the end of the table (unless you have something like UNIQUE constraints).
When updating a record, the database first has to find the row. Without an index on the search column, that means scanning the entire table once for every UPDATE command:
CREATE INDEX whatever ON sources(source_id);
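If changing the schema is not an option, the index can also be created from Python right before the bulk update. A minimal sketch, using an in-memory database and made-up stand-in data so it runs on its own (the table and column names follow the question; the values are only illustrative):

import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE sources(source_id INTEGER, source_attr TEXT)")
cur.executemany("INSERT INTO sources VALUES (?, ?)",
                [(i, "old") for i in range(100)])
con.commit()

# One-time: index the column used in the WHERE clause, so each UPDATE
# becomes an index lookup instead of a full table scan.
cur.execute("CREATE INDEX IF NOT EXISTS idx_source_id ON sources(source_id)")

# The same executemany() update as in the question, now served by the index.
data = [("new", i) for i in range(100)]
cur.executemany("UPDATE sources SET source_attr = ? WHERE source_id = ?", data)
con.commit()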
But if source_id is the primary key, you should just declare it as such (which creates an implicit index):
CREATE TABLE sources(
    source_id INTEGER PRIMARY KEY,
    source_attr TEXT,
    [...]
);
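For completeness, a self-contained sketch of the same bulk update against such a table (names and data are invented for illustration). Because source_id is declared INTEGER PRIMARY KEY, it is an alias for the rowid, so each WHERE source_id = ? lookup is a direct key lookup and no separate index is needed:

import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# INTEGER PRIMARY KEY makes source_id an alias for the rowid.
cur.execute("CREATE TABLE sources(source_id INTEGER PRIMARY KEY, source_attr TEXT)")

cur.executemany("INSERT INTO sources VALUES (?, ?)",
                [(i, "old") for i in range(100)])
con.commit()

# Bulk update: one executemany() call, one commit at the end, and each row
# is located via the primary key instead of a full table scan.
data = [("new", i) for i in range(100)]
cur.executemany("UPDATE sources SET source_attr = ? WHERE source_id = ?", data)
con.commit()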