Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Better SQLite corruption detection

At first, some background:

My Android app has DB table with a lot of four-column rows. It sends requests to the server and server responds only when all of these four values are "valid". A few of the thousands users reported that something doesn't work for them (since awhile they are not getting the results from the server) - I was trying to figure out what's causing the problem and it turned out that the only possible cause is the DB corruption that's not being detected.

In ACRA logs I've got some messages with SQL errors, but these were about application not being able to open the file because of it being corrupt. That gave me some clue, but I was still not convinced that this is the issue. So, I created a very simple Python script which changes random bytes in the DB file and checks how SQLite will deal with that:

import random
import array
import sqlite3

db = array.array('B')
db.fromstring(open('db').read())

ta =  [x for x in sqlite3.connect('db').execute('SELECT * FROM table ORDER BY _id')]

results = [0,0,0,0]
tries = 1000

for i in xrange(0,tries):
    work = db[:]
    while work == db: 
        for j in xrange(0,random.randint(1,5)):
            work[random.randint(1,len(db))-1] = random.randint(0,255)

    work.tofile(open('outdb','w'))

    try:
        c = sqlite3.connect('outdb')
        results[0] += 1

        for r in c.execute('PRAGMA integrity_check;'):
        results[1] += 1 if (r[0] == 'ok') else 0 
    except:
        continue    

    try:
        results[3] += 1 if [x for x in c.execute('SELECT * FROM table ORDER BY _id')] != ta else 0
        results[2] += 1
    except:
        c.close()
        continue

print 'Results for '+str(tries)+' tests:'
print 'Creating connection failed '+str(tries-results[0])+ ' times'
print 'Integrity check failed '+str(results[0]-results[1])+ ' times'
print 'Running a SELECT * query failed '+str(results[1]-results[2])+ ' times'
print 'Data was succesfully altered '+str(results[3])+ ' times'

The results showed that "editing" table data in this way is entirely possible:

Results for 1000 tests:
Creating connection failed 0 times
Integrity check failed 503 times
Running a SELECT * query failed 289 times
Data was succesfully altered 193 times

It's generally interesting to see that running a query failed for half of the modifications that went undetected by integrity check, but the most interesting thing for me is that something may swap random bytes in my DB rendering my application useless for a part of my users.

I've read about possible causes of corruption on SQLite website and also on StackOverflow, I know that e.g. forcing application to close may do a harm to the DB. I'd just like to know if it's possible to implement a fast and more robust DB integrity check.

I'm reading the data from a one column of the whole table at startup (for autocompletion), so I thought of calculating some hash from the all values - I think this would work quite good, since some hash function are designed just for doing integrity checks, but maybe there's a simpler, faster and better solution - I'm thus asking you, if you know any.

like image 889
user1234567 Avatar asked Jan 23 '12 14:01

user1234567


1 Answers

I don't know of any SQLite feature like this, so I'd say that calculating a hash is the simplest solution, take a look at the MessageDigest class for a start.

like image 135
dnet Avatar answered Sep 18 '22 17:09

dnet