I have the following tables:
types
 id | name
----+------
  1 | A
  2 | B
  4 | C
  8 | D
 16 | E
 32 | F
and
vendors
 id | name   | type
----+--------+------
  1 | Alex   |    2  -- type B only
  2 | Bob    |    5  -- A, C
  3 | Cheryl |   32  -- F
  4 | David  |   43  -- F, D, A, B
  5 | Ed     |   15  -- A, B, C, D
  6 | Felix  |    8  -- D
  7 | Gopal  |    4  -- C
  8 | Herry  |    9  -- A, D
  9 | Iris   |    7  -- A, B, C
 10 | Jack   |   23  -- A, B, C, E
I would now like to run queries like:
select id, name from vendors where type & 16 > 0;  -- should return Jack, as he is type E
select id, name from vendors where type & 7 = 7;   -- should return Ed, Iris, Jack (all of A, B and C)
select id, name from vendors where type & 8 > 0;   -- should return David, Ed, Felix, Herry
What is the best possible index for the types and vendors tables in Postgres? I may have millions of rows in vendors. Moreover, what are the tradeoffs of this bitwise method compared with a many-to-many relation using a third table? Which is better?
You can use partial indexes to work around the fact that "&" isn't an indexable operator (as far as I know):
CREATE INDEX vendors_typeA ON vendors(id) WHERE (type & 1) > 0;
CREATE INDEX vendors_typeB ON vendors(id) WHERE (type & 2) > 0;
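Note that the planner will only consider one of these partial indexes when the query's WHERE clause matches (or implies) the index predicate, so queries have to be written against the same expression. A quick sketch, using the vendors_typeB index above:

-- The predicate matches vendors_typeB, so the planner can use that partial index:
SELECT id, name FROM vendors WHERE (type & 2) > 0;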
Of course, you'll need to add a new index every time you add a new type, which is one of the reasons for expanding the data into an association table that can then be indexed properly. You can always write triggers to additionally maintain a bitmask table, but maintain the data through the many-to-many table itself, as it will be much clearer.
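For reference, here is a minimal sketch of that normalized alternative; the table and column names (vendor_type, vendor_id, type_id) are illustrative, not anything from the original schema:

-- One row per (vendor, type) pairing; the primary key prevents duplicates.
CREATE TABLE vendor_type (
    vendor_id integer NOT NULL REFERENCES vendors(id),
    type_id   integer NOT NULL REFERENCES types(id),
    PRIMARY KEY (vendor_id, type_id)
);

-- Supports lookups from the type side ("all vendors of type E").
CREATE INDEX vendor_type_type_idx ON vendor_type (type_id);

-- "All vendors of type E", no bit arithmetic required:
SELECT v.id, v.name
FROM vendors v
JOIN vendor_type vt ON vt.vendor_id = v.id
JOIN types t ON t.id = vt.type_id
WHERE t.name = 'E';

-- "All vendors that are A, B and C" (the type & 7 = 7 case):
SELECT v.id, v.name
FROM vendors v
JOIN vendor_type vt ON vt.vendor_id = v.id
JOIN types t ON t.id = vt.type_id
WHERE t.name IN ('A', 'B', 'C')
GROUP BY v.id, v.name
HAVING count(*) = 3;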
If your entire evaluation of scaling and performance is "I may have millions of rows", you haven't done enough analysis to justify this sort of optimisation. Create a properly structured, clear model first, and optimise it later on the basis of real statistics about how it performs.