
How do you efficiently trim an SQLite database down to a given file size?

I'm using SQLite 3.7.2 on Windows. My database is used to store log data which gets generated 24/7. The schema is basically:

CREATE TABLE log_message(id INTEGER PRIMARY KEY AUTOINCREMENT, process_id INTEGER, text TEXT);
CREATE TABLE process(id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);

The log_message.process_id field maps to process.id, thus associating each log message with the process it originates from.

Now, sooner or later the database becomes too large and I'd like to drop the oldest entries (those with the lowest log_message.id values) until the database falls below a given size again (say, 1 GB). To do so, I'm currently doing

PRAGMA page_count;
PRAGMA page_size;

after each few log messages to get the size of the database. If it exceeds my limit, I just remove a fraction (right now: 100 messages) of the log messages like this:

BEGIN TRANSACTION;
DELETE FROM log_message WHERE id IN (SELECT id FROM log_message ORDER BY id LIMIT 100);
DELETE FROM process WHERE id IN (SELECT id FROM process EXCEPT SELECT process_id FROM log_message);
COMMIT;
VACUUM;

The latter DELETE statement removes all unreferenced entries from the process table. I repeat this process until the file size is acceptable again.
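Put together, the check-then-delete loop described above can be sketched in Python's standard sqlite3 module (a rough illustration, not the asker's actual code; the `trim_to_limit` name, the batch size as a parameterized value, and the empty-table guard are my additions):

```python
import sqlite3

def trim_to_limit(conn: sqlite3.Connection, limit_bytes: int) -> None:
    """Delete the oldest log batches until the file is within limit_bytes."""
    batch = 100  # rows removed per iteration, as in the question
    while True:
        # Size of the database as SQLite itself reports it.
        page_count = conn.execute("PRAGMA page_count").fetchone()[0]
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]
        if page_count * page_size <= limit_bytes:
            break
        with conn:  # one transaction per batch
            deleted = conn.execute(
                "DELETE FROM log_message WHERE id IN "
                "(SELECT id FROM log_message ORDER BY id LIMIT ?)",
                (batch,)).rowcount
            conn.execute(
                "DELETE FROM process WHERE id IN "
                "(SELECT id FROM process EXCEPT "
                " SELECT process_id FROM log_message)")
        if deleted == 0:
            break  # tables are empty; the limit cannot be reached
        conn.execute("VACUUM")  # reclaim the freed pages
```

Note that VACUUM cannot run inside a transaction, which is why it is issued only after the batch has been committed.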

This suffers from at least two issues:

  1. The choice of removing 100 log messages at a time is fairly arbitrary; I made that number up based on a few experiments. I'd like to know in advance how many entries I have to remove.
  2. The repeated VACUUM calls can take quite some time (the SQLite home page says that VACUUM can take up to half a second per MB on Linux; I doubt it's any faster on Windows).

Does anybody have other suggestions on how to do this?

Frerich Raabe asked May 13 '11


1 Answer

When you have a "right-sized" database, count the number of log_message rows:

SELECT COUNT(*) FROM log_message;

Store this number.

When you want to shrink the file, issue the count command again. Calculate the difference, delete that many rows from your database, then VACUUM.

This is only approximate, but it will get you close to 1 GB pretty quickly. If you are still over, you can fall back to the 100-rows-at-a-time method.
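A minimal sketch of this scheme, assuming Python's standard sqlite3 module and that the row count at the "right-sized" state was stored earlier (`trim_by_count` and `target_rows` are illustrative names, not from the answer):

```python
import sqlite3

def trim_by_count(conn: sqlite3.Connection, target_rows: int) -> None:
    """Delete the oldest rows so roughly target_rows remain, then vacuum once."""
    current = conn.execute("SELECT COUNT(*) FROM log_message").fetchone()[0]
    excess = current - target_rows
    if excess <= 0:
        return  # already at or below the stored "right-sized" count
    with conn:
        # Oldest entries have the lowest AUTOINCREMENT ids.
        conn.execute(
            "DELETE FROM log_message WHERE id IN "
            "(SELECT id FROM log_message ORDER BY id LIMIT ?)", (excess,))
        conn.execute(
            "DELETE FROM process WHERE id IN "
            "(SELECT id FROM process EXCEPT "
            " SELECT process_id FROM log_message)")
    conn.execute("VACUUM")  # a single VACUUM instead of one per batch
```

The advantage over the batch loop is that the delete happens in one pass and VACUUM runs exactly once, which addresses issue 2 from the question.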

James Anderson answered Oct 22 '22