python sqlite3, how often do I have to commit?

I have a for loop that is making many changes to a database with a sqlite manager class I wrote, but I am unsure about how often I have to commit...

    for i in list:
        c.execute('UPDATE table SET x=y WHERE foo=bar')
        conn.commit()
        c.execute('UPDATE table SET x=z+y WHERE foo=bar')
        conn.commit()

Basically my question is whether I have to call commit twice there, or if I can just call it once after I have made both changes?

471

asked Mar 27 '16 03:03

Joff

1 Answer

Whether you call conn.commit() once at the end of the procedure or after every single database change depends on several factors.

What concurrent readers see

This is what everybody thinks of at first sight: when a change to the database is committed, it becomes visible to other connections. Until it is committed, it remains visible only to the connection that made the change. Because of the limited concurrency features of SQLite, other connections can only read the database while such a transaction is open; they cannot write to it.

You can investigate what happens by running the following script and investigating its output:

    import os
    import sqlite3

    _DBPATH = "./q6996603.sqlite"

    def fresh_db():
        # start from an empty database file with a single table
        if os.path.isfile(_DBPATH):
            os.remove(_DBPATH)
        with sqlite3.connect(_DBPATH) as conn:
            cur = conn.cursor().executescript("""
                CREATE TABLE "mytable" (
                    "id" INTEGER PRIMARY KEY AUTOINCREMENT, -- rowid
                    "data" INTEGER
                );
                """)
        print("created %s" % _DBPATH)

    # functions are syntactic sugar only and use global conn, cur, rowid

    def select():
        sql = 'select * from "mytable"'
        rows = cur.execute(sql).fetchall()
        print("   same connection sees", rows)
        # simulate another script accessing the database concurrently
        with sqlite3.connect(_DBPATH) as conn2:
            rows = conn2.cursor().execute(sql).fetchall()
        print("   other connection sees", rows)

    def count():
        print("counting up")
        cur.execute('update "mytable" set data = data + 1 where "id" = ?', (rowid,))

    def commit():
        print("commit")
        conn.commit()

    # now the script
    fresh_db()
    with sqlite3.connect(_DBPATH) as conn:
        print("--- prepare test case")
        sql = 'insert into "mytable"(data) values(17)'
        print(sql)
        cur = conn.cursor().execute(sql)
        rowid = cur.lastrowid
        print("rowid =", rowid)
        commit()
        select()
        print("--- two consecutive w/o commit")
        count()
        select()
        count()
        select()
        commit()
        select()
        print("--- two consecutive with commit")
        count()
        select()
        commit()
        select()
        count()
        select()
        commit()
        select()

Output:

    $ python try.py
    created ./q6996603.sqlite
    --- prepare test case
    insert into "mytable"(data) values(17)
    rowid = 1
    commit
       same connection sees [(1, 17)]
       other connection sees [(1, 17)]
    --- two consecutive w/o commit
    counting up
       same connection sees [(1, 18)]
       other connection sees [(1, 17)]
    counting up
       same connection sees [(1, 19)]
       other connection sees [(1, 17)]
    commit
       same connection sees [(1, 19)]
       other connection sees [(1, 19)]
    --- two consecutive with commit
    counting up
       same connection sees [(1, 20)]
       other connection sees [(1, 19)]
    commit
       same connection sees [(1, 20)]
       other connection sees [(1, 20)]
    counting up
       same connection sees [(1, 21)]
       other connection sees [(1, 20)]
    commit
       same connection sees [(1, 21)]
       other connection sees [(1, 21)]
    $

So it depends on whether you can live with the situation that a concurrent reader, be it in the same script or in another program, will be off by two at times.
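Applied to the loop from the question, this means one commit after both updates is enough as long as no other reader needs the intermediate value. Here is a minimal sketch, assuming a hypothetical table `items` with columns `foo` and `x` (the question does not show its schema) and using the connection as a context manager to commit:

```python
import sqlite3

def apply_updates(conn, keys, y, z):
    """Run both updates per key, committing once per loop iteration."""
    for key in keys:
        # "with conn" commits on success and rolls back on an exception,
        # so both updates become visible to other connections together.
        with conn:
            conn.execute("UPDATE items SET x = ? WHERE foo = ?", (y, key))
            conn.execute("UPDATE items SET x = x + ? WHERE foo = ?", (z, key))

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (foo TEXT PRIMARY KEY, x INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, 0)", [("a",), ("b",)])
conn.commit()

apply_updates(conn, ["a", "b"], y=5, z=2)
print(conn.execute("SELECT foo, x FROM items ORDER BY foo").fetchall())
```

Moving the `with conn:` line outside the `for` loop would commit only once for the whole loop, at the price of hiding all changes from other connections until the loop has finished.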

When a large number of changes is to be done, two other aspects enter the scene:

Performance

The performance of database changes depends dramatically on how you do them. The SQLite FAQ already notes this:

Actually, SQLite will easily do 50,000 or more INSERT statements per second on an average desktop computer. But it will only do a few dozen transactions per second. [...]

It is absolutely helpful to understand the details here, so do not hesitate to follow the link and dive in. Also see this awesome analysis. It's written in C, but the results would be similar if one did the same in Python.

Note: While both resources refer to INSERT, the situation will be very much the same for UPDATE, for the same reasons.
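To get a feeling for the difference on your own machine, something like the following rough sketch may help. It times the same inserts once with a commit per row and once with a single commit. The helper name `timed_inserts`, the table `t` and the file name are made up for this illustration, and the absolute numbers will vary a lot with disk and operating system:

```python
import os
import sqlite3
import tempfile
import time

def timed_inserts(path, n, commit_each):
    """Insert n rows; commit after every row or once at the end."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS t (data INTEGER)")
    conn.execute("DELETE FROM t")
    conn.commit()
    start = time.perf_counter()
    for i in range(n):
        conn.execute("INSERT INTO t (data) VALUES (?)", (i,))
        if commit_each:
            conn.commit()   # one transaction (and disk sync) per row
    conn.commit()           # the only commit when commit_each is False
    elapsed = time.perf_counter() - start
    rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, rows

with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "timing.sqlite")   # hypothetical file name
    per_row, _ = timed_inserts(db, 100, commit_each=True)
    batched, _ = timed_inserts(db, 100, commit_each=False)
    print("commit per row: %.3f s, single commit: %.3f s" % (per_row, batched))
```

An on-disk file is used on purpose, because an in-memory database would not show the cost of syncing each transaction to disk.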

Exclusively locking the database

As already mentioned above, an open (uncommitted) transaction will block changes from concurrent connections. So it makes sense to bundle many changes to the database into a single transaction by executing them and then jointly committing the whole bunch of them.
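The blocking itself can be observed with a small experiment. The sketch below assumes a fresh database file in the default journal mode; the helper name `second_writer_blocked` and the table `t` are invented for this illustration. The second connection is opened with `timeout=0` so that it fails immediately instead of waiting for the lock:

```python
import os
import sqlite3
import tempfile

def second_writer_blocked(path):
    """Return True if a second connection cannot write while the first
    connection holds an uncommitted change."""
    writer = sqlite3.connect(path)
    writer.execute("CREATE TABLE IF NOT EXISTS t (data INTEGER)")
    writer.execute("INSERT INTO t (data) VALUES (1)")   # transaction now open
    other = sqlite3.connect(path, timeout=0)            # do not wait for the lock
    try:
        other.execute("INSERT INTO t (data) VALUES (2)")
        other.commit()
        blocked = False
    except sqlite3.OperationalError:                    # "database is locked"
        blocked = True
    finally:
        other.close()
    writer.commit()                                     # releases the lock
    writer.close()
    return blocked

with tempfile.TemporaryDirectory() as tmp:
    print(second_writer_blocked(os.path.join(tmp, "lock.sqlite")))
```

With the default `timeout` of five seconds the second connection would instead wait for the first one to commit, which is why long-running transactions make other writers appear to hang.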

Unfortunately, sometimes, computing the changes may take some time. When concurrent access is an issue you will not want to lock your database for that long. Because it can become rather tricky to collect pending UPDATE and INSERT statements somehow, this will usually leave you with a tradeoff between performance and exclusive locking.
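One possible middle ground, in cases where the changes can be computed up front, is to collect them as plain parameter tuples and then write them in chunks, committing after each chunk. This is only a sketch under that assumption; the helper name `apply_in_chunks` and the chunk size are arbitrary choices for illustration:

```python
import sqlite3

def apply_in_chunks(conn, changes, chunk_size=1000):
    """Apply precomputed (new_value, rowid) pairs, committing per chunk."""
    commits = 0
    for start in range(0, len(changes), chunk_size):
        chunk = changes[start:start + chunk_size]
        with conn:   # one transaction, and thus one lock period, per chunk
            conn.executemany('UPDATE "mytable" SET data = ? WHERE id = ?', chunk)
        commits += 1
    return commits

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "mytable" (id INTEGER PRIMARY KEY, data INTEGER)')
conn.executemany('INSERT INTO "mytable" (data) VALUES (?)', [(0,)] * 2500)
conn.commit()

# Compute all changes first, outside of any transaction...
changes = [(rowid * 2, rowid) for rowid in range(1, 2501)]
# ...then write them in short transactions.
print(apply_in_chunks(conn, changes), "commits")
```

A larger chunk size moves toward the single-transaction end of the tradeoff (faster, longer lock), a smaller one toward commit-per-change (slower, shorter locks).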

163

answered Sep 23 '22 19:09

flaschbier
