Is there a way to index blob fields and have the index used for beginning of blob searches?
Currently I have hashes stored as hexadecimal in text fields. These hashes in hexadecimal form are 32 characters long, and form the bulk of the data in the database.
Problem is, they are often searched by their starting bytes, as in
select * from mytable where hash like '00a1b2%'
I would like to store them as blobs, as this saves about 30% of the database size. However while
select * from mytable where hex(hash) like '00a1b2%'
works, it's also much slower and does not seem to use the index.
Searching for exact blob matches does use the index, so the index is working.
Is there a way to perform a search on a blob start (with binary/memcmp "collation") that would use the index?
I also tried substr(), it's apparently faster than hex() but still not indexed
select * from mytable where substr(hash, 1, 6) = x'00a1b2'
To be able to use an index for LIKE, the table column must have TEXT affinity, and the index must be case insensitive:
CREATE TABLE mytable(... hash TEXT, ...);
CREATE INDEX hash_index ON mytable(hash COLLATE NOCASE);
Functions like hex
or substr
prevent usage of indexes.
Blobs can be indexed and compared like other types. This allows you to express a prefix search with two comparisons:
SELECT * FROM mytable WHERE hash >= x'00a1b2' AND hash < x'00a1b3'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With