Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

bitwise mask vs IN() efficiency in sqlite?

I have two ways to select a set of entries from the database:

  SELECT ... WHERE `level` IN (1,2,4,8) LIMIT ...;  

or

  SELECT ... WHERE `level` & mask LIMIT ...;

There are 4 'levels' total, numbered 1,2,4,8 (for reasons of ability to use the same mask elsewhere too). Both the braces of IN() or the mask can contain any set of one or more of the 4 levels. The column is indexed. The query is still taking way longer than comfortable, and we're trying to optimize it for speed.

Yesterday one person said decided using naive IN() results in up to four comparisons and that I should be using a bit mask instead. Today I heard the bit mask will completely thwart advantages from index on the column, and will be much slower.

Can you tell me which approach will be faster?

like image 404
SF. Avatar asked Mar 04 '11 11:03

SF.


1 Answers

Your question is quite old but I'm still gonna answer it nonetheless.

The bitmask will most probably be slower as it has to work out the computation of the bitwise AND whereas IN will use the indexed value of level to look it up in the arguments enclosed within the parentheses (which, I believe, should be a single O(log(n)) operation).

Now, the thing that you may be missing, is that they don't do the same thing.

Your first query will simply check if level is either 1, 2, 4 or 8.

Your second query, or actually something like:

SELECT ... WHERE (`level` & mask) = mask LIMIT ...;

Has the ability to lookup levels that contain the mask you want and potentially some more, in your case it could be checking all value combinations between 1 and 15. Hence the performance hit.


As for the bruteforced benchmark @AlanFoster suggested, I don't agree with him.

It's far better to prefix the query with either:

  • EXPLAIN, or
  • EXPLAIN QUERY PLAN

And inspect how many rows SQLite is matching.


Update

EXPLAIN QUERY PLAN SELECT * FROM ... WHERE level IN (2, 3);

SEARCH TABLE ... USING INDEX ..._level (level=?) (~20 rows)

EXPLAIN QUERY PLAN SELECT * FROM ... WHERE (level & 2) = 2;

SCAN TABLE ... (~500000 rows)

As you can see, the bitwise AND operator needs a full-table scan.

like image 139
Alix Axel Avatar answered Sep 18 '22 17:09

Alix Axel