Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

fast check if intersection set empty (false positives is ok)

I need to do a lot of checks if the intersection set of two sets (one is identical for all checks, the other one changes) is empty or not.

It's ok if the check says (in a small amount of checks) it's not empty, but it is (there can be a second filtering step which is more precise), so false positives are ok. It's not allowed, that I filter out something that defnitly has a not empty intersections, so false negatives are not ok.

So, only a scenario:

{A,B,C,D} <-> {D,E,F} => true (D in intersection set), never allowed to be false

{A,B,C} <-> {D,E,F} => false (no intersection set), can also return true in a minor number of checks

For a single element I would use a bloom filter, but I can not find something similar for a set of elements and bloom filter checking element by element would be a possible option, but I'm searching for something better.

like image 665
Peter V. Avatar asked Oct 29 '22 22:10

Peter V.


1 Answers

Thanks a lot for your answers, helped me to come up with a good solution and solve the problem.

The idea is mostly really primitive, but sufficient.

I create two bitsets, one for the changing set and one for the fixed set. Each element of a set is hashed to one bit (so e.g. for a long one bit in 1 to 64) and then combined for a set (mostly a Bloom-Bitset with k=1).

To check if there is a non empty intersection set I simply need to combine the two bitsets with bit-and-operation and check if the result is not 0.

The false-positive rate would be (I think, didn't do the math) worse, but it's good enough for my case.

Example:

[A,B,C] => 0000100010100000

[B,D,F] => 0100000010000100

---------------------- &

0000000010000000 != 0 => true

like image 104
Peter V. Avatar answered Nov 01 '22 10:11

Peter V.