Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

performance of insert with python and sqlite3

Tags:

python

sql

sqlite

I'm doing big batch inserts into an SQLite3 database and I'm trying to get a sense for what sort of performance I should be expecting versus what I'm actually seeing.

My table looks like this:

cursor.execute(
            "CREATE TABLE tweets(
             tweet_hash TEXT PRIMARY KEY ON CONFLICT REPLACE,
             tweet_id INTEGER,
             tweet_text TEXT)"
        )

and my inserts look like this:

cursor.executemany("INSERT INTO tweets VALUES (?, ?, ?)", to_write)

where to_write is a list of tuples.

Currently, with about 12 million rows in the database, inserting 50 000 rows is taking me around 16 minutes, running on a 2008 macbook.

Does this sound reasonable, or is there something gross happening?

like image 962
cmyr Avatar asked Jul 25 '13 19:07

cmyr


People also ask

Is Executemany faster than execute?

executemany() skyrockets. For example, on my machine inserting 1,000 rows into the same table in a database on the local network using cursor. execute() takes 410 ms, whereas using cursor. executemany() requires only 20 ms.

Is SQLite good for Python?

Python SQLite3 module is used to integrate the SQLite database with Python. It is a standardized Python DBI API 2.0 and provides a straightforward and simple-to-use interface for interacting with SQLite databases.

Is SQLite slow?

The SQLite docs explains why this is so slow: Transaction speed is limited by disk drive speed because (by default) SQLite actually waits until the data really is safely stored on the disk surface before the transaction is complete. That way, if you suddenly lose power or if your OS crashes, your data is still safe.

Is SQLite fast?

SQLite is fast and reliable and it requires no configuration or maintenance. It keeps things simple. SQLite "just works".


1 Answers

As I understand the main reason of bad performance is time you waste to commit many SQLite transactions. What to do?

Drop the indexes, then

PRAGMA synchronous = OFF (or NORMAL)

Insert blocks of N rows (define N, try N=5000 to start). Before inserting block do

BEGIN TRANSACTION

after inserting do

COMMIT

See also http://www.sqlite.org/faq.html#q19

like image 92
drastega Avatar answered Sep 22 '22 03:09

drastega