I need to create an match-finder system for some data set, as follows:
There is a set of objects, each identified by a string ObjectID
.
Each object has exactly N properties Pi. Each property value is a string.
Database example for N = 3 (in real life N = 8).
ObjectID: P1 P2 P3 -------------------------------- APPLE: RED ROUND FRUIT ORANGE: ORANGE ROUND FRUIT CARROT: RED LONG VEGETABLE
The system must return sets of ObjectID
s, matching given query on object properties. In query user must specify all property values. Alternatively, for some or all properties in query user may specify a "wildcard" *
, meaning that any property value would match criteria.
Example queries:
P1 P2 P3 => Result ------------------------------------ * ROUND FRUIT => APPLE, ORANGE RED LONG VEGETABLE => CARROT RED * * => CARROT, APPLE
All of this is trivially done with SQL.
The question is: is there a neat way to do that with Redis?
Note that I'm interested in Redis-based solutions specifically, for self-education purposes; other DBs are offtopic for this particular question.
Update: Trivial solution with explicit ObjectID
lists for each Pi and application-side filtering does not look neat enough to me :-)
What you are trying to do here is an inverted index.
For each column, have it map to a "set". Then, you can intersect the sets to get the result.
So, APPLE: RED ROUND FRUIT
would map to the following inserts:
SADD p1:RED APPLE
SADD p2:ROUND APPLE
SADD p3:FRUIT APPLE
Then, let's say I want to query for * ROUND FRUIT
, I would do:
SINTER p2:ROUND p3:FRUIT
This command is taking the intersection of the items in the p2:ROUND
set and the p3:FRUIT
set. This will return all the items that are ROUND
and FRUIT
, not caring what p1
is.
Some other examples:
SMEMBERS p1:GREEN
SINTER p1:RED p2:ROUND p3:FRUIT
SUNION p1:RED p1:GREEN
My above answer is going to use some computation power because the intersection operation is O(N*M)
. Here is a way of doing it that is more memory intensive, but will have faster retrieval because it effectively precomputes the indexes.
For each combination of properties, make a key that stores a set:
So, APPLE: RED ROUND FRUIT
would map to the following inserts:
SADD RED:ROUND:FRUIT APPLE
SADD :ROUND:FRUIT APPLE
SADD RED::FRUIT APPLE
SADD RED:ROUND: APPLE
SADD RED:: APPLE
SADD :ROUND: APPLE
SADD ::FRUIT APPLE
SADD ::: APPLE
Then, to query, you simply access the respective key. For example, * ROUND FRUIT
would simply be
SMEMBERS :ROUND:FRUIT
Obviously, this doesn't scale well at all in terms of memory when you have many dimensions, but it will be extremely snappy to retrieve results.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With