
Merge SQLite files into one db file, and 'begin/commit' question

This post refers to this page for merging SQLite databases.

The sequence is as follows. Let's say I want to merge a.db and b.db. On the command line, I do the following.

  • sqlite3 a.db
  • attach 'b.db' as toM;
  • begin; <--
  • insert into benchmark select * from toM.benchmark;
  • commit; <--
  • detach database toM;

It works well, but in the referenced post, the questioner asks about speeding up the merge, and the answer is to use the 'begin' and 'commit' commands.

Then I came up with the following Python code to do exactly the same thing. I abstract the SQLite function calls with a SQLiteDB class, and one of its methods is runCommand(). I get the same error even if I delete the self.connector.commit() line.

# run command
def runCommand(self, command):
    self.cursor.execute(command)
    self.connector.commit() # same error even though I delete this line

db = SQLiteDB('a.db')
cmd = "attach \"%s\" as toMerge" % "b.db"
print(cmd)
db.runCommand(cmd)
cmd = "begin"
db.runCommand(cmd)
cmd = "insert into benchmark select * from toMerge.benchmark"
db.runCommand(cmd)
cmd = "commit"
db.runCommand(cmd)
cmd = "detach database toMerge"
db.runCommand(cmd)

But, I got the following error.

OperationalError: cannot commit - no transaction is active

Despite the error, the resulting db is merged correctly. And without the begin/commit, there's no error at all.

  • Why can't I run the begin/commit command?
  • Is it absolutely necessary to run begin/commit to safely merge the db files? The post says that the purpose of begin/commit is for speedup. Then, what's the difference between using and not using the begin/commit command in terms of speedup?
prosseek asked Sep 11 '10 03:09



1 Answer

Apparently, Cursor.execute doesn't support the 'commit' command. It does support the 'begin' command, but this is redundant because sqlite3 begins transactions for you anyway:

>>> import sqlite3
>>> conn = sqlite3.connect(':memory:')
>>> cur = conn.cursor()
>>> cur.execute('begin')
<sqlite3.Cursor object at 0x0104B020>
>>> cur.execute('CREATE TABLE test (id INTEGER)')
<sqlite3.Cursor object at 0x0104B020>
>>> cur.execute('INSERT INTO test VALUES (1)')
<sqlite3.Cursor object at 0x0104B020>
>>> cur.execute('commit')

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    cur.execute('commit')
OperationalError: cannot commit - no transaction is active
>>> 

Just use the commit method on your Connection object.
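As a minimal sketch of that advice, the merge from the question could look like the following. The helper name `merge_benchmark` is hypothetical, and the code assumes both files already contain a `benchmark` table with identical columns:

```python
import sqlite3


def merge_benchmark(target_path, source_path):
    """Copy every row of source.benchmark into target.benchmark.

    Hypothetical helper: assumes both files already contain a
    `benchmark` table with identical columns.
    """
    conn = sqlite3.connect(target_path)
    try:
        cur = conn.cursor()
        # ATTACH cannot run inside a transaction; none is open yet.
        cur.execute("ATTACH DATABASE ? AS toMerge", (source_path,))
        # sqlite3 opens a transaction implicitly before the INSERT.
        cur.execute("INSERT INTO benchmark SELECT * FROM toMerge.benchmark")
        # Commit through the Connection rather than execute('commit').
        conn.commit()
        # DETACH also has to run outside a transaction.
        cur.execute("DETACH DATABASE toMerge")
    finally:
        conn.close()
```

Calling `merge_benchmark('a.db', 'b.db')` would then append the rows of b.db to a.db without an explicit 'begin' or 'commit' statement.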

As for your second question, it is not absolutely necessary to use begin/commit when merging the files: just be sure that there are absolutely no disk errors, no modifications to the databases, and no people looking at the computer the wrong way while it is happening. So begin/commit is probably a good idea. Of course, if the original databases aren't being modified (I honestly haven't looked), then there is no need even for that. If there is an error, you can just scrap the partial output and start over.

It also provides a speedup because not every change has to be written to disk as it occurs. Changes can be stored in memory and written in bulk. But as mentioned, sqlite3 handles this for you.
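To see the effect yourself, here is a rough timing sketch. The function names `insert_rows` and `time_inserts` and the single-column `benchmark` table are assumptions made for illustration, not part of the question:

```python
import sqlite3
import time


def insert_rows(conn, rows, commit_each):
    """Insert rows, committing after every row or once at the end."""
    cur = conn.cursor()
    for row in rows:
        cur.execute("INSERT INTO benchmark VALUES (?)", row)
        if commit_each:
            conn.commit()  # one transaction per row
    conn.commit()  # single commit for the batched case; no-op otherwise


def time_inserts(path, n, commit_each):
    """Return the seconds taken to insert n rows into benchmark at path."""
    conn = sqlite3.connect(path)
    # Illustrative schema only; the real benchmark table is not shown.
    conn.execute("CREATE TABLE IF NOT EXISTS benchmark (id INTEGER)")
    start = time.perf_counter()
    insert_rows(conn, [(i,) for i in range(n)], commit_each)
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed
```

On a file-backed database, the `commit_each=True` run is typically noticeably slower than the batched run, though the exact numbers depend on the disk and the journal settings.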

Also, it's worth mentioning that

cmd = "attach \"%s\" as toMerge" % "b.db"

is wrong in the sense that % string formatting is the older, discouraged style. If you want to do the wrong thing correctly, it's

cmd = 'attach "{0}" as toMerge'.format("b.db") #why not just one string though?

This is forward compatible with newer versions of Python, which will make porting the code easier.

If you want to do the right thing, it's

cmd = "attach ? as toMerge"
cursor.execute(cmd, ('b.db', ))

This avoids SQL injection and is, apparently, slightly faster, so it's win-win.
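One caveat: only the file name can be bound as a parameter; the alias after AS is an identifier and has to be part of the SQL text. A small sketch of one way to handle that follows. The helper `attach` and its validation rule are assumptions for illustration, not an sqlite3 API:

```python
import sqlite3


def attach(conn, path, alias="toMerge"):
    """Attach the database file at `path` under the schema name `alias`.

    The path is bound as a parameter. The alias cannot be bound, so it
    is validated before being placed into the SQL text.
    """
    # Assumption: restricting aliases to identifier-like names is enough
    # for this use; anything containing quotes or spaces is rejected.
    if not alias.isidentifier():
        raise ValueError("invalid schema alias: %r" % (alias,))
    conn.execute('ATTACH DATABASE ? AS "%s"' % alias, (path,))
```

With this, a file name containing quotes (for example `it's b.db`) is passed through safely, while an alias such as `x; DROP TABLE t` raises ValueError before any SQL runs.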

You could modify your runCommand method as follows:

def runCommand(self, sql, params=(), commit=True):
    self.cursor.execute(sql, params)
    if commit:
        self.connector.commit()

Now you can skip the commit after a command by passing commit=False when you don't need one. This preserves the notion of a transaction.
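For instance, several statements can be grouped into one all-or-nothing unit. This is only a sketch: the question does not show the SQLiteDB class, so its constructor here is assumed, and `run_atomically` is a hypothetical helper:

```python
import sqlite3


class SQLiteDB(object):
    """Thin wrapper; the constructor layout is assumed, not from the question."""

    def __init__(self, path):
        self.connector = sqlite3.connect(path)
        self.cursor = self.connector.cursor()

    def runCommand(self, sql, params=(), commit=True):
        self.cursor.execute(sql, params)
        if commit:
            self.connector.commit()


def run_atomically(db, statements):
    """Run (sql, params) pairs as one transaction; roll back all on error."""
    try:
        for sql, params in statements:
            db.runCommand(sql, params, commit=False)
        db.connector.commit()  # one commit for the whole batch
    except sqlite3.Error:
        db.connector.rollback()  # undo every statement in the batch
        raise
```

If any statement in the batch fails, none of the earlier ones in that batch are kept, which is the behavior that committing after every command would lose.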

aaronasterling answered Sep 28 '22 15:09