Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Bitwise operations in Postgres

I have the following tables:

types | id | name
------+----+----------
         1 | A
         2 | B
         4 | C
         8 | D
         16| E
         32| F

and

vendors | id | name     | type
--------+----+----------+-----
           1 | Alex     | 2     //type B only
           2 | Bob      | 5     //A,C
           3 | Cheryl   | 32    //F
           4 | David    | 43    //F,D,A,B
           5 | Ed       | 15    //A,B,C,D
           6 | Felix    | 8     //D
           7 | Gopal    | 4     //C
           8 | Herry    | 9     //A,D
           9 | Iris     | 7     //A,B,C
           10| Jack     | 23    //A,B,C,E

I would like to query now:

select id, name from vendors where type & 16 >0 //should return Jack as he is type E
select id, name from vendors where type & 7 >0 //should return Ed, Iris, Jack
select id, name from vendors where type & 8 >0 //should return David, Ed, Felix, Herry 

What is the best possible index for tables types and vendors in postgres? I may have millions of rows in vendors. Moreover, what are the tradeoffs of using this bitwise method compared with Many To Many relation using a 3rd table? Which is better?

like image 924
jerrymouse Avatar asked Feb 10 '12 10:02

jerrymouse


1 Answers

Use can use partial indices to work around the fact that "&" isn't an indexable operator (afaik):

CREATE INDEX vendors_typeA ON vendors(id) WHERE (type & 2) > 0;
CREATE INDEX vendors_typeB ON vendors(id) WHERE (type & 4) > 0;

Of course, you'll need to add a new index every time you add a new type. Which is one of the reasons for expanding the data into an association table which can then be indexed properly. You can always write triggers to maintain a bitmask table additionally, but use the many-to-many table to actually maintain the data normally, as it will be much clearer.

If your entire evaluation of scaling and performance is to say "I may have millions of rows", you haven't done enough to start going for this sort of optimisation. Create a properly-structured clear model first, optimise it later on the basis of real statistics about how it performs.

like image 182
araqnid Avatar answered Oct 01 '22 23:10

araqnid