At first, some background:
My Android app has DB table with a lot of four-column rows. It sends requests to the server and server responds only when all of these four values are "valid". A few of the thousands users reported that something doesn't work for them (since awhile they are not getting the results from the server) - I was trying to figure out what's causing the problem and it turned out that the only possible cause is the DB corruption that's not being detected.
In ACRA logs I've got some messages with SQL errors, but these were about application not being able to open the file because of it being corrupt. That gave me some clue, but I was still not convinced that this is the issue. So, I created a very simple Python script which changes random bytes in the DB file and checks how SQLite will deal with that:
import random
import array
import sqlite3
db = array.array('B')
db.fromstring(open('db').read())
ta = [x for x in sqlite3.connect('db').execute('SELECT * FROM table ORDER BY _id')]
results = [0,0,0,0]
tries = 1000
for i in xrange(0,tries):
work = db[:]
while work == db:
for j in xrange(0,random.randint(1,5)):
work[random.randint(1,len(db))-1] = random.randint(0,255)
work.tofile(open('outdb','w'))
try:
c = sqlite3.connect('outdb')
results[0] += 1
for r in c.execute('PRAGMA integrity_check;'):
results[1] += 1 if (r[0] == 'ok') else 0
except:
continue
try:
results[3] += 1 if [x for x in c.execute('SELECT * FROM table ORDER BY _id')] != ta else 0
results[2] += 1
except:
c.close()
continue
print 'Results for '+str(tries)+' tests:'
print 'Creating connection failed '+str(tries-results[0])+ ' times'
print 'Integrity check failed '+str(results[0]-results[1])+ ' times'
print 'Running a SELECT * query failed '+str(results[1]-results[2])+ ' times'
print 'Data was succesfully altered '+str(results[3])+ ' times'
The results showed that "editing" table data in this way is entirely possible:
Results for 1000 tests:
Creating connection failed 0 times
Integrity check failed 503 times
Running a SELECT * query failed 289 times
Data was succesfully altered 193 times
It's generally interesting to see that running a query failed for half of the modifications that went undetected by integrity check, but the most interesting thing for me is that something may swap random bytes in my DB rendering my application useless for a part of my users.
I've read about possible causes of corruption on SQLite website and also on StackOverflow, I know that e.g. forcing application to close may do a harm to the DB. I'd just like to know if it's possible to implement a fast and more robust DB integrity check.
I'm reading the data from a one column of the whole table at startup (for autocompletion), so I thought of calculating some hash from the all values - I think this would work quite good, since some hash function are designed just for doing integrity checks, but maybe there's a simpler, faster and better solution - I'm thus asking you, if you know any.
I don't know of any SQLite feature like this, so I'd say that calculating a hash is the simplest solution, take a look at the MessageDigest
class for a start.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With