The short version of problem.
Sometimes a query (which uses indices) that usually runs in a second takes a minute or even more to execute. EXPLAIN shows that full scan of table was not made. Database is large (2 Gb, 4.5 mln records) and is used exclusively. Re-running same query again is fast. Till some special moment...
The long version of problem.
I have a log's database on SQLITE:
CREATE TABLE log( id integer primary key autoincrement,
msg text,
created_at int,
kind text,
computer text,
process text,
who text
);
CREATE INDEX idxlog_created_at ON log(created_at);
CREATE INDEX idxlog_kind_computer_id ON log(kind,computer,id);
CREATE INDEX idxlog_kind_computer_process_id ON log(kind,computer,process,id);
CREATE INDEX idxlog_kind_computer_process_who_id ON log(kind,computer,process,who,id);
CREATE INDEX idxlog_kind_id ON log(kind,id);
kind ===> 'debug', 'error', 'warn', 'info'
computer ===> computer name
process ===> process name
who ===> component name (that sended a message to log)
Indices are created to ensure fast respond to any possible query to a log.
id column is included in indices to ensure fast ORDER BY.
Sqlite: 3.7.7.1
Platform: Windows XP
Language: Delphi via sqlite3.dll (from sqlite.org)
Pragmas (on this particular connection):
PRAGMA encoding = "UTF-8";
PRAGMA foreign_keys = ON;
PRAGMA synchronous = NORMAL;
PRAGMA page_size = 8192;
PRAGMA automatic_index = 0;
PRAGMA temp_store = FILE;
Opening SQLITE connection:
SQLite3_Open_v2(@UTF8String(AFileName)[1], Fdb,
SQLITE_OPEN_READWRITE or
SQLITE_OPEN_CREATE or
SQLITE_OPEN_PRIVATECACHE or
SQLITE_OPEN_NOMUTEX
);
An example of a log query:
SELECT
1 as today, id as rowid, kind,who,msg,computer,process,created_at,id
FROM log
WHERE id > 4149245 AND id <= 9223372036854775807 AND kind = 'error'
UNION ALL
SELECT
1 as today, id as rowid, kind,who,msg,computer,process,created_at,id
FROM log
WHERE id > 4149245 AND id <= 9223372036854775807 AND kind = 'debug'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id
FROM log
WHERE id > 4149245 AND id <= 9223372036854775807 AND kind = 'timing'
ORDER BY id DESC LIMIT 100
Each query is a UNION of a query by kind to ensure indices are used.
The database has ~ 4.5 mln records, size of database file ~ 1.8 Gb. Database is used exclusively by logging process (no additional logging was made during these measurements).
Now, the problem.
Sometimes, performance of such a query is about 100 slower than "ordinary" performance.
An ordinary performance is lesser than a second. Now, I've made timings and saved info about long-running queries. Here you are:
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id
FROM log _computer_process_id
WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'error' AND computer='KRAFTWAY' AND process='D:\xxx.exe'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id
FROM log _computer_process_id
WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'warn' AND computer='KRAFTWAY' AND process='D:\xxx.exe'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id
FROM log _computer_process_id
WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'info' AND computer='KRAFTWAY' AND process='D:\xxx.exe'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id
FROM log _computer_process_id
WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'debug' AND computer='KRAFTWAY' AND process='D:\xxx.exe'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id
FROM log _computer_process_id WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'timing' AND computer='KRAFTWAY' AND process='D:\xxx.exe'
ORDER BY id DESC LIMIT 100
plan: 4,0,0,SEARCH TABLE log AS _computer_process_id USING INDEX idxlog_kind_computer_process_id (kind=? AND computer=? AND process=?) (~2 rows) 5,0,0,SEARCH TABLE log AS _computer_process_id USING INDEX idxlog_kind_computer_process_id (kind=? AND computer=? AND process=?) (~2 rows) 3,0,0,COMPOUND SUBQUERIES 4 AND 5 (UNION ALL) 6,0,0,SEARCH TABLE log AS _computer_process_id USING INDEX idxlog_kind_computer_process_id (kind=? AND computer=? AND process=?) (~2 rows) 2,0,0,COMPOUND SUBQUERIES 3 AND 6 (UNION ALL) 7,0,0,SEARCH TABLE log AS _computer_process_id USING INDEX idxlog_kind_computer_process_id (kind=? AND computer=? AND process=?) (~2 rows) 1,0,0,COMPOUND SUBQUERIES 2 AND 7 (UNION ALL) 8,0,0,SEARCH TABLE log AS _computer_process_id USING INDEX idxlog_kind_computer_process_id (kind=? AND computer=? AND process=?) (~2 rows) 0,0,0,COMPOUND SUBQUERIES 1 AND 8 (UNION ALL)
61326 ms
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_id WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'error' AND computer='KRAFTWAY'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_id WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'warn' AND computer='KRAFTWAY'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_id WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'debug' AND computer='KRAFTWAY'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_id WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'timing' AND computer='KRAFTWAY' ORDER BY id DESC LIMIT 100
plan: 3,0,0,SEARCH TABLE log AS _computer_id USING INDEX idxlog_kind_computer_id (kind=? AND computer=?) (~2 rows) 4,0,0,SEARCH TABLE log AS _computer_id USING INDEX idxlog_kind_computer_id (kind=? AND computer=?) (~2 rows) 2,0,0,COMPOUND SUBQUERIES 3 AND 4 (UNION ALL) 5,0,0,SEARCH TABLE log AS _computer_id USING INDEX idxlog_kind_computer_id (kind=? AND computer=?) (~2 rows) 1,0,0,COMPOUND SUBQUERIES 2 AND 5 (UNION ALL) 6,0,0,SEARCH TABLE log AS _computer_id USING INDEX idxlog_kind_computer_id (kind=? AND computer=?) (~2 rows) 0,0,0,COMPOUND SUBQUERIES 1 AND 6 (UNION ALL)
45643 ms
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_process_who_id WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'error' AND computer='KRAFTWAY' AND process='D:\xxx.exe' AND who='main(TMessageRouter)'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_process_who_id WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'warn' AND computer='KRAFTWAY' AND process='D:\xxx.exe' AND who='main(TMessageRouter)'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_process_who_id WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'info' AND computer='KRAFTWAY' AND process='D:\xxx.exe' AND who='main(TMessageRouter)'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_process_who_id WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'debug' AND computer='KRAFTWAY' AND process='D:\xxx.exe' AND who='main(TMessageRouter)'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_process_who_id WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'timing' AND computer='KRAFTWAY' AND process='D:\xxx.exe' AND who='main(TMessageRouter)'
ORDER BY id DESC LIMIT 100
plan: 4,0,0,SEARCH TABLE log AS _computer_process_who_id USING INDEX idxlog_kind_computer_process_who_id (kind=? AND computer=? AND process=? AND who=?) (~2 rows) 5,0,0,SEARCH TABLE log AS _computer_process_who_id USING INDEX idxlog_kind_computer_process_who_id (kind=? AND computer=? AND process=? AND who=?) (~2 rows) 3,0,0,COMPOUND SUBQUERIES 4 AND 5 (UNION ALL) 6,0,0,SEARCH TABLE log AS _computer_process_who_id USING INDEX idxlog_kind_computer_process_who_id (kind=? AND computer=? AND process=? AND who=?) (~2 rows) 2,0,0,COMPOUND SUBQUERIES 3 AND 6 (UNION ALL) 7,0,0,SEARCH TABLE log AS _computer_process_who_id USING INDEX idxlog_kind_computer_process_who_id (kind=? AND computer=? AND process=? AND who=?) (~2 rows) 1,0,0,COMPOUND SUBQUERIES 2 AND 7 (UNION ALL) 8,0,0,SEARCH TABLE log AS _computer_process_who_id USING INDEX idxlog_kind_computer_process_who_id (kind=? AND computer=? AND process=? AND who=?) (~2 rows) 0,0,0,COMPOUND SUBQUERIES 1 AND 8 (UNION ALL)
175899 ms
Yeah, this query was executing 3 minutes.
See the plan after each query? That's right, after each too long-running query I've executed "EXPLAIN QUERY PLAN " + SQL of query and saved the result. One can clearly see no SCAN in the plan. This means no FULL TABLE scan is made, indices are used and everything is okay.
Yeah, everything is okay except the time of query.
Monitoring logging process during such a long-running query shows constant disk i/o at about 1 Mb per second rate (which is obviously not a bottleneck, HDD can do 100 Mb per second) and about-zero CPU activity. If paused under debugger long-running thread is deep inside sqlite.dll ending at ntdll.ZwReadFile.
I've tried and defragged everything. It runs faster but the problem remains.
And if I re-run the same query it executes almost immediately.
Same database, same query. First run - 160 seconds, second run - 2 seconds.
Another interesting fact: longest running queries often returns no rows.
Yeah, I understand caching and OS is keeping some pages in memory.
That's all right with me. I don't understand why SQLITE is reading so slow if it needs pages? Why reading at 1 mb per second rate? Is data is scattered too much inside SQLITE database so there's lots of seeking? Then why? Indices are used to ensure this is not the problem, isn't it?
I am completely clueless and appreciate any help.
UPDATE.
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4033214 AND id <= 9223372036854775807 AND kind = 'error' AND computer='KRAFTWAY' AND process='D:\xxx.exe'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4033214 AND id <= 9223372036854775807 AND kind = 'warn' AND computer='KRAFTWAY' AND process='D:\xxx.exe'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4033214 AND id <= 9223372036854775807 AND kind = 'info' AND computer='KRAFTWAY' AND process='DD:\xxx.exe'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4033214 AND id <= 9223372036854775807 AND kind = 'debug' AND computer='KRAFTWAY' AND process='D:\xxx.exe'
UNION ALL
SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4033214 AND id <= 9223372036854775807 AND kind = 'timing' AND computer='KRAFTWAY' AND process='D:\xxx.exe'
ORDER BY id DESC LIMIT 100
plan: 4,0,0,SEARCH TABLE log USING COVERING INDEX idxlog_kind_computer_process_id_who_msg_created_at (kind=? AND computer=? AND process=?) (~2 rows) 5,0,0,SEARCH TABLE log USING COVERING INDEX idxlog_kind_computer_process_id_who_msg_created_at (kind=? AND computer=? AND process=?) (~2 rows) 3,0,0,COMPOUND SUBQUERIES 4 AND 5 (UNION ALL) 6,0,0,SEARCH TABLE log USING COVERING INDEX idxlog_kind_computer_process_id_who_msg_created_at (kind=? AND computer=? AND process=?) (~2 rows) 2,0,0,COMPOUND SUBQUERIES 3 AND 6 (UNION ALL) 7,0,0,SEARCH TABLE log USING COVERING INDEX idxlog_kind_computer_process_id_who_msg_created_at (kind=? AND computer=? AND process=?) (~2 rows) 1,0,0,COMPOUND SUBQUERIES 2 AND 7 (UNION ALL) 8,0,0,SEARCH TABLE log USING COVERING INDEX idxlog_kind_computer_process_id_who_msg_created_at (kind=? AND computer=? AND process=?) (~2 rows) 0,0,0,COMPOUND SUBQUERIES 1 AND 8 (UNION ALL)
1138669 ms It's ~ 18 minutes.
Re-running query: less than 100 ms.
During those 18 minutes db process was reading at about 500 Kb per second rate.
I assume you get an OK from
PRAGMA integrity_check
Are you actually setting exclusive locking?
PRAGMA locking_mode = EXCLUSIVE
Can you reproduce the problem on a smaller database with the same schema and queries, just less data?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With