I need to do a lot of checks if the intersection set of two sets (one is identical for all checks, the other one changes) is empty or not.
It's ok if the check says (in a small amount of checks) it's not empty, but it is (there can be a second filtering step which is more precise), so false positives are ok. It's not allowed, that I filter out something that defnitly has a not empty intersections, so false negatives are not ok.
So, only a scenario:
{A,B,C,D} <-> {D,E,F} => true
(D in intersection set), never allowed to be false
{A,B,C} <-> {D,E,F} => false
(no intersection set), can also return true in a minor number of checks
For a single element I would use a bloom filter, but I can not find something similar for a set of elements and bloom filter checking element by element would be a possible option, but I'm searching for something better.
Thanks a lot for your answers, helped me to come up with a good solution and solve the problem.
The idea is mostly really primitive, but sufficient.
I create two bitsets, one for the changing set and one for the fixed set. Each element of a set is hashed to one bit (so e.g. for a long one bit in 1 to 64) and then combined for a set (mostly a Bloom-Bitset with k=1).
To check if there is a non empty intersection set I simply need to combine the two bitsets with bit-and-operation and check if the result is not 0.
The false-positive rate would be (I think, didn't do the math) worse, but it's good enough for my case.
Example:
[A,B,C] => 0000100010100000
[B,D,F] => 0100000010000100
---------------------- &
0000000010000000 != 0 => true
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With