Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sqlite3 Python: How to do an efficient bulk update?

I can do very efficient bulk inserts in Sqlite3 on Python (2.7) with this code:

cur.executemany("INSERT INTO " + tableName + " VALUES (?, ?, ?, ?);", data)

But I can't get updates to work efficiently. I thought it might be a problem of the database structure/indexing, but even on a test database with only one table of 100 rows, the update still takes about 2-3 seconds.

I've tried different code variations. The latest code I have is from this answer to a previous question about update and executemany, but it's just as slow for me as any other attempt I've made:

data = []
for s in sources:
    source_id = s['source_id']
    val = get_value(s['source_attr'])
    x=[val, source_id]
    data.append(x)
cur.executemany("UPDATE sources SET source_attr = ? WHERE source_id = ?", data)
con.commit()

How could I improve this code to do a big bulk update efficiently?

like image 404
LunaiThi Avatar asked Apr 22 '16 02:04

LunaiThi


People also ask

What is Fetchall in sqlite3?

The fetchall() method retrieves all the rows in the result set of a query and returns them as list of tuples. (If we execute this after retrieving few rows it returns the remaining ones). The fetchone() method fetches the next row in the result of a query and returns it as a tuple.


1 Answers

When inserting a record, the database just needs to write a row at the end of the table (unless you have something like UNIQUE constraints).

When updating a record, the database needs to find the row. This requires scanning through the entire table (for each command), unless you have an index on the search column:

CREATE INDEX whatever ON sources(source_id);

But if source_id is the primary key, you should just declare it as such (which creates an implicit index):

CREATE TABLE sources(
    source_id INTEGER PRIMARY KEY,
    source_attr TEXT,
    [...]
);
like image 160
CL. Avatar answered Nov 14 '22 23:11

CL.