If I have a set of tags (<100), and a set of objects (~25000), where each object has some sub-set of the tags, do you know of an existing data-structure that would allow for fast retrieval of those objects that satisfy some boolean function of the tags?
Addition/deletion of tags and objects need not be particularly fast, but selection of those objects with tags that satisfy the boolean function should be.
Now that I have written my question down, it looks as if I'm describing an in-memory database, but originally I was thinking of some binary tree like structure for the objects where, for each branch, taking the left/right branch would be equivalent to deciding on have/have-not some tag. But that would not allow don't-care tags? i am asking as I wondered if this has been done before and find it hard to google for data structures.
Here's a suggestion: Use a bit-array for each tag, with as many elements as there are objects; each index of which represents one object. The value at each index is 1 if that object has that tag.
Boolean functions on tags are then fast set operations on this bit-array. And the resulting bit-array gives you the documents that satisfy the criteria.
This not very efficient if the tags or objects keep changing frequently but is perhaps applicable for you.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With