Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLITE query performance differ too much to be reliable or understandable

The short version of problem.

Sometimes a query (which uses indices) that usually runs in a second takes a minute or even more to execute. EXPLAIN shows that full scan of table was not made. Database is large (2 Gb, 4.5 mln records) and is used exclusively. Re-running same query again is fast. Till some special moment...

The long version of problem.

I have a log's database on SQLITE:

    CREATE TABLE log( id integer primary key autoincrement,
  msg text,
  created_at int,
  kind text,
  computer text,
  process text,
  who text
  );
    CREATE INDEX idxlog_created_at ON log(created_at);
    CREATE INDEX idxlog_kind_computer_id ON log(kind,computer,id);
    CREATE INDEX idxlog_kind_computer_process_id ON log(kind,computer,process,id);
    CREATE INDEX idxlog_kind_computer_process_who_id ON log(kind,computer,process,who,id);
    CREATE INDEX idxlog_kind_id ON log(kind,id);

kind ===> 'debug', 'error', 'warn', 'info'
computer ===> computer name
process ===> process name
who ===> component name (that sended a message to log)

Indices are created to ensure fast respond to any possible query to a log. id column is included in indices to ensure fast ORDER BY.

Sqlite: 3.7.7.1
Platform: Windows XP
Language: Delphi via sqlite3.dll (from sqlite.org)
Pragmas (on this particular connection):

PRAGMA encoding = "UTF-8"; 
  PRAGMA foreign_keys = ON; 
  PRAGMA synchronous = NORMAL; 
  PRAGMA page_size = 8192; 
  PRAGMA automatic_index = 0;
  PRAGMA temp_store = FILE;

Opening SQLITE connection:

SQLite3_Open_v2(@UTF8String(AFileName)[1], Fdb,
    SQLITE_OPEN_READWRITE or
    SQLITE_OPEN_CREATE or
    SQLITE_OPEN_PRIVATECACHE or
    SQLITE_OPEN_NOMUTEX
  );

An example of a log query:

SELECT 
   1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
FROM log  
WHERE id > 4149245 AND id <= 9223372036854775807 AND kind = 'error' 

UNION ALL

SELECT 
   1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
FROM log  
WHERE id > 4149245 AND id <= 9223372036854775807 AND kind = 'debug' 

UNION ALL

 SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
 FROM log  
 WHERE id > 4149245 AND id <= 9223372036854775807 AND kind = 'timing' 

ORDER BY id DESC LIMIT 100

Each query is a UNION of a query by kind to ensure indices are used.

The database has ~ 4.5 mln records, size of database file ~ 1.8 Gb. Database is used exclusively by logging process (no additional logging was made during these measurements).

Now, the problem.

Sometimes, performance of such a query is about 100 slower than "ordinary" performance.

An ordinary performance is lesser than a second. Now, I've made timings and saved info about long-running queries. Here you are:

    SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
FROM log _computer_process_id  
WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'error' AND computer='KRAFTWAY'  AND process='D:\xxx.exe'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
FROM log _computer_process_id  
WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'warn' AND computer='KRAFTWAY'  AND process='D:\xxx.exe'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
FROM log _computer_process_id  
WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'info' AND computer='KRAFTWAY'  AND process='D:\xxx.exe'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
FROM log _computer_process_id  
WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'debug' AND computer='KRAFTWAY'  AND process='D:\xxx.exe'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
FROM log _computer_process_id  WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'timing' AND computer='KRAFTWAY'  AND process='D:\xxx.exe'  

ORDER BY id DESC LIMIT 100

plan: 4,0,0,SEARCH TABLE log AS _computer_process_id USING INDEX idxlog_kind_computer_process_id (kind=? AND computer=? AND process=?) (~2 rows) 5,0,0,SEARCH TABLE log AS _computer_process_id USING INDEX idxlog_kind_computer_process_id (kind=? AND computer=? AND process=?) (~2 rows) 3,0,0,COMPOUND SUBQUERIES 4 AND 5 (UNION ALL) 6,0,0,SEARCH TABLE log AS _computer_process_id USING INDEX idxlog_kind_computer_process_id (kind=? AND computer=? AND process=?) (~2 rows) 2,0,0,COMPOUND SUBQUERIES 3 AND 6 (UNION ALL) 7,0,0,SEARCH TABLE log AS _computer_process_id USING INDEX idxlog_kind_computer_process_id (kind=? AND computer=? AND process=?) (~2 rows) 1,0,0,COMPOUND SUBQUERIES 2 AND 7 (UNION ALL) 8,0,0,SEARCH TABLE log AS _computer_process_id USING INDEX idxlog_kind_computer_process_id (kind=? AND computer=? AND process=?) (~2 rows) 0,0,0,COMPOUND SUBQUERIES 1 AND 8 (UNION ALL)

61326 ms

  SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_id  WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'error' AND computer='KRAFTWAY'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_id  WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'warn' AND computer='KRAFTWAY'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_id  WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'debug' AND computer='KRAFTWAY'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_id  WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'timing' AND computer='KRAFTWAY'  ORDER BY id DESC LIMIT 100

plan: 3,0,0,SEARCH TABLE log AS _computer_id USING INDEX idxlog_kind_computer_id (kind=? AND computer=?) (~2 rows) 4,0,0,SEARCH TABLE log AS _computer_id USING INDEX idxlog_kind_computer_id (kind=? AND computer=?) (~2 rows) 2,0,0,COMPOUND SUBQUERIES 3 AND 4 (UNION ALL) 5,0,0,SEARCH TABLE log AS _computer_id USING INDEX idxlog_kind_computer_id (kind=? AND computer=?) (~2 rows) 1,0,0,COMPOUND SUBQUERIES 2 AND 5 (UNION ALL) 6,0,0,SEARCH TABLE log AS _computer_id USING INDEX idxlog_kind_computer_id (kind=? AND computer=?) (~2 rows) 0,0,0,COMPOUND SUBQUERIES 1 AND 6 (UNION ALL)

45643 ms

     SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_process_who_id  WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'error' AND computer='KRAFTWAY'  AND process='D:\xxx.exe'  AND who='main(TMessageRouter)'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_process_who_id  WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'warn' AND computer='KRAFTWAY'  AND process='D:\xxx.exe'  AND who='main(TMessageRouter)'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_process_who_id  WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'info' AND computer='KRAFTWAY'  AND process='D:\xxx.exe'  AND who='main(TMessageRouter)'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_process_who_id  WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'debug' AND computer='KRAFTWAY'  AND process='D:\xxx.exe'  AND who='main(TMessageRouter)'  

UNION ALL 

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log _computer_process_who_id  WHERE id > 4149246 AND id <= 9223372036854775807 AND kind = 'timing' AND computer='KRAFTWAY'  AND process='D:\xxx.exe'  AND who='main(TMessageRouter)'  

ORDER BY id DESC LIMIT 100

plan: 4,0,0,SEARCH TABLE log AS _computer_process_who_id USING INDEX idxlog_kind_computer_process_who_id (kind=? AND computer=? AND process=? AND who=?) (~2 rows) 5,0,0,SEARCH TABLE log AS _computer_process_who_id USING INDEX idxlog_kind_computer_process_who_id (kind=? AND computer=? AND process=? AND who=?) (~2 rows) 3,0,0,COMPOUND SUBQUERIES 4 AND 5 (UNION ALL) 6,0,0,SEARCH TABLE log AS _computer_process_who_id USING INDEX idxlog_kind_computer_process_who_id (kind=? AND computer=? AND process=? AND who=?) (~2 rows) 2,0,0,COMPOUND SUBQUERIES 3 AND 6 (UNION ALL) 7,0,0,SEARCH TABLE log AS _computer_process_who_id USING INDEX idxlog_kind_computer_process_who_id (kind=? AND computer=? AND process=? AND who=?) (~2 rows) 1,0,0,COMPOUND SUBQUERIES 2 AND 7 (UNION ALL) 8,0,0,SEARCH TABLE log AS _computer_process_who_id USING INDEX idxlog_kind_computer_process_who_id (kind=? AND computer=? AND process=? AND who=?) (~2 rows) 0,0,0,COMPOUND SUBQUERIES 1 AND 8 (UNION ALL)

175899 ms

Yeah, this query was executing 3 minutes.

See the plan after each query? That's right, after each too long-running query I've executed "EXPLAIN QUERY PLAN " + SQL of query and saved the result. One can clearly see no SCAN in the plan. This means no FULL TABLE scan is made, indices are used and everything is okay.

Yeah, everything is okay except the time of query.

Monitoring logging process during such a long-running query shows constant disk i/o at about 1 Mb per second rate (which is obviously not a bottleneck, HDD can do 100 Mb per second) and about-zero CPU activity. If paused under debugger long-running thread is deep inside sqlite.dll ending at ntdll.ZwReadFile.

I've tried and defragged everything. It runs faster but the problem remains.

And if I re-run the same query it executes almost immediately.

Same database, same query. First run - 160 seconds, second run - 2 seconds.

Another interesting fact: longest running queries often returns no rows.

Yeah, I understand caching and OS is keeping some pages in memory.

That's all right with me. I don't understand why SQLITE is reading so slow if it needs pages? Why reading at 1 mb per second rate? Is data is scattered too much inside SQLITE database so there's lots of seeking? Then why? Indices are used to ensure this is not the problem, isn't it?

I am completely clueless and appreciate any help.

UPDATE.

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4033214 AND id <= 9223372036854775807 AND kind = 'error' AND computer='KRAFTWAY' AND process='D:\xxx.exe'

UNION ALL

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4033214 AND id <= 9223372036854775807 AND kind = 'warn' AND computer='KRAFTWAY' AND process='D:\xxx.exe'

UNION ALL

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4033214 AND id <= 9223372036854775807 AND kind = 'info' AND computer='KRAFTWAY' AND process='DD:\xxx.exe'

UNION ALL

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4033214 AND id <= 9223372036854775807 AND kind = 'debug' AND computer='KRAFTWAY' AND process='D:\xxx.exe'

UNION ALL

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4033214 AND id <= 9223372036854775807 AND kind = 'timing' AND computer='KRAFTWAY' AND process='D:\xxx.exe'

ORDER BY id DESC LIMIT 100

plan: 4,0,0,SEARCH TABLE log USING COVERING INDEX idxlog_kind_computer_process_id_who_msg_created_at (kind=? AND computer=? AND process=?) (~2 rows) 5,0,0,SEARCH TABLE log USING COVERING INDEX idxlog_kind_computer_process_id_who_msg_created_at (kind=? AND computer=? AND process=?) (~2 rows) 3,0,0,COMPOUND SUBQUERIES 4 AND 5 (UNION ALL) 6,0,0,SEARCH TABLE log USING COVERING INDEX idxlog_kind_computer_process_id_who_msg_created_at (kind=? AND computer=? AND process=?) (~2 rows) 2,0,0,COMPOUND SUBQUERIES 3 AND 6 (UNION ALL) 7,0,0,SEARCH TABLE log USING COVERING INDEX idxlog_kind_computer_process_id_who_msg_created_at (kind=? AND computer=? AND process=?) (~2 rows) 1,0,0,COMPOUND SUBQUERIES 2 AND 7 (UNION ALL) 8,0,0,SEARCH TABLE log USING COVERING INDEX idxlog_kind_computer_process_id_who_msg_created_at (kind=? AND computer=? AND process=?) (~2 rows) 0,0,0,COMPOUND SUBQUERIES 1 AND 8 (UNION ALL)

1138669 ms It's ~ 18 minutes.

Re-running query: less than 100 ms.

During those 18 minutes db process was reading at about 500 Kb per second rate.

like image 488
Gregory Avatar asked Nov 13 '22 19:11

Gregory


1 Answers

I assume you get an OK from

PRAGMA integrity_check

Are you actually setting exclusive locking?

PRAGMA locking_mode = EXCLUSIVE

Can you reproduce the problem on a smaller database with the same schema and queries, just less data?

like image 150
zeFrenchy Avatar answered Dec 10 '22 19:12

zeFrenchy