Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to improve sqlite like statement performance

I create a table use such schema:

CREATE TABLE wordIndex(id integer primary key, word varchar(128), offset integer, length integer);
CREATE INDEX word_idx on wordIndex(word);

Now the table have about 450,000 row records.When I use Like statement Below on ipod4, the performance is not good: select * from wordIndex where word like 'test acces%'; Use explain output:

explain select * from wordIndex where word like 'test acces%';
0|Trace|0|0|0||00|
1|Goto|0|16|0||00|
2|OpenRead|0|2|0|4|00|
3|Rewind|0|14|0||00|
4|String8|0|2|0|test acces%|00|
5|Column|0|1|3||00|
6|Function|1|2|1|like(2)|02|
7|IfNot|1|13|1||00|
8|Rowid|0|4|0||00|
9|Column|0|1|5||00|
10|Column|0|2|6||00|
11|Column|0|3|7||00|
12|ResultRow|4|4|0||00|
13|Next|0|4|0||01|
14|Close|0|0|0||00|
15|Halt|0|0|0||00|
16|Transaction|0|0|0||00|
17|VerifyCookie|0|2|0||00|
18|TableLock|0|2|0|wordIndex|00|
19|Goto|0|2|0||00|

May be I need build an additional inverted index to improve the performance or ...? Thanks advance!

like image 506
ericfang Avatar asked Dec 27 '12 15:12

ericfang


3 Answers

Indexes and like don't get along in most databases. The best bet is to rewrite the query as a range query, if possible, because the index will then be used:

select *
from wordIndex
where word between 'test acces' and 'test acces{'

(The open brace is the ASCII character immediately following 'z'.)

If you are looking for patterns at the beginning of a word (say '%test'), then you may have to resign yourself to a full table scan.

EDIT:

Indexes and like *do` get along nowadays in most databases when the patterns starts with a constant, so you can do:

select *
from wordIndex
where word like 'test acces%' ;

I'm not 100% sure about SQLite, though, so check the execution plan to see if it uses the index.

like image 80
Gordon Linoff Avatar answered Nov 17 '22 03:11

Gordon Linoff


Try this:

SELECT * FROM wordIndex
WHERE word COLLATE NOCASE BETWEEN @SearchString AND @SearchString || '~~~~~~~~~~'

"~" is the biggest ASCII symbol.

like image 20
Hamlet Hakobyan Avatar answered Nov 17 '22 02:11

Hamlet Hakobyan


I will have a slightly different answer than Gordon Linoff, but with the same approach.

If you want to keep any characters that follow 'test acces' you should try this:

SELECT * FROM wordIndex
WHERE word > 'test acces' AND word < 'test accet';
like image 2
Stephane D. Avatar answered Nov 17 '22 04:11

Stephane D.