 

Redis and Querying Values

Tags:

redis

Redis is conceptually different from traditional SQL databases that I use, and I'm trying to figure out if it is right for my project... I've been looking around but can't seem to find an answer to my question.

I have a set of Users that I need to store, each with a unique ID and several values (such as their name) associated with it. It seems like I can simply store those as a hash:

user:fef982dcfe1a7bcba4849b4c281bba95
"username" "andrewm" "name" "Andrew"

I also have a bunch of messages I want to store, each having a few properties such as the sender and recipient:

message:1a7bcba4849b4c281bfef98a952dcfeb
"sender" "fef982dcfe1a7bcba4849b4c281bba95" "recipient" "82dcfe1a7bcba4849b4c281bba95fef9" "message" "Hi!"

My question is: how would I go about retrieving all of the messages sent by a specific user (designated by their hash)? Should I be using a traditional relational database instead, or even a NoSQL database like MongoDB (which I've used before)? If so, does anyone have any suggestions for high-performance stores? I won't be doing any true searching (i.e. MySQL LIKE queries), just key-value lookups, really.

Andrew M asked Jul 01 '12 02:07


2 Answers

It is certainly possible to model this data with Redis, but you need to think in terms of data structures AND access paths. With Redis, the access paths are not managed implicitly (as they are with indexes in an RDBMS or MongoDB).

For the provided example, you could have:

user:<user hash> -> hash of user properties
user:<user hash>:sent -> set of <msg hash>
user:<user hash>:received -> set of <msg hash>
message:<msg hash> -> hash of message properties

Adding/deleting a message would mean maintaining the *:sent and *:received sets corresponding to the senders and recipients, on top of adding/deleting the message object itself.
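
As a rough illustration of that bookkeeping, here is a minimal Python sketch. It assumes `r` is a redis-py client created with `decode_responses=True`; the function names `add_message` and `delete_message` are hypothetical and only the key layout comes from the answer above.

```python
def add_message(r, msg_id, sender, recipient, text):
    """Store the message hash and register it in both access-path sets."""
    r.hset(
        f"message:{msg_id}",
        mapping={"sender": sender, "recipient": recipient, "message": text},
    )
    r.sadd(f"user:{sender}:sent", msg_id)
    r.sadd(f"user:{recipient}:received", msg_id)


def delete_message(r, msg_id):
    """Remove the message from both sets, then delete the hash itself."""
    key = f"message:{msg_id}"
    sender, recipient = r.hmget(key, ["sender", "recipient"])
    if sender is None:
        return False  # no such message
    r.srem(f"user:{sender}:sent", msg_id)
    r.srem(f"user:{recipient}:received", msg_id)
    r.delete(key)
    return True
```

Each call here is a separate round trip and the group is not atomic; if that matters, the same commands could be queued in a MULTI/EXEC transaction (a redis-py pipeline) so the sets and the hash stay consistent.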

Retrieving sent or received messages for a given user is just a SMEMBERS command, or a SORT if you also want to retrieve the properties of the messages at the same time:

# Get a list of message hash codes only in one roundtrip
smembers user:<user hash>:received

# Get a list of message contents in one roundtrip
sort user:<user hash>:received by nosort get message:*->sender get message:*->message

For the rationale about using sort, see:

  • Getting multiple key values from Redis
  • Need help conceptualizing in Redis/NoSQL
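
From application code, the SORT variant might look like the following sketch. It assumes `r` is a redis-py client with `decode_responses=True`; `received_messages` is a hypothetical helper name, and the GET patterns are the ones from the command above.

```python
def received_messages(r, user_id):
    """Fetch sender and text of every received message in one round trip."""
    rows = r.sort(
        f"user:{user_id}:received",
        by="nosort",  # skip the actual sorting, we only want the GET lookups
        get=["message:*->sender", "message:*->message"],
        groups=True,  # one (sender, message) tuple per set member
    )
    return [{"sender": sender, "message": text} for sender, text in rows]
```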

Note 1: with Redis it is better to use integers as keys rather than UUID or hash codes (especially in sets), since they are stored in a more efficient way.

Note 2: if you need to order the messages, then lists must be used instead of sets. The consequence is that only the oldest messages can be removed, and only the newest messages can be added, in an efficient way. You would probably also need to add a global list for all messages.
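
A minimal sketch of the list-based variant, again assuming a redis-py client `r` with `decode_responses=True`. The helper names and the `max_len` cap are hypothetical; the cap is only there to show that dropping the oldest entries is the cheap kind of removal.

```python
def push_message(r, user_id, msg_id, max_len=1000):
    """Prepend the newest message id and drop the oldest beyond max_len."""
    key = f"user:{user_id}:received"
    r.lpush(key, msg_id)           # newest message goes to the head
    r.ltrim(key, 0, max_len - 1)   # oldest entries fall off the tail


def latest_messages(r, user_id, count=10):
    """Return up to `count` message ids, newest first."""
    return r.lrange(f"user:{user_id}:received", 0, count - 1)
```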

Didier Spezia answered Sep 28 '22 20:09


Redis's basic data types do not support multi-condition queries, full-text search, etc. We have therefore modified the Redis source code and turned Redis into a database that can be queried much like a SQL database, by means of auxiliary indexes.

The project homepage is https://oncedb.com

OnceDB does not change the data storage structure of Redis. Redis database files can be directly operated in OnceDB and then returned to Redis for use.

Index search

Create index

Full-text search performs poorly; creating indexes improves it. The method is to create an ordered list (a sorted set) for each indexed field, and then intersect those ordered lists when a conditional query is performed.

# Create hash data
hmset article:001 poster dota visit 21 key js
hmset article:002 poster dota visit 11 key c
hmset article:003 poster like visit 34 key js
hmset article:004 poster like visit 44 key c

Then we create indexes for the above fields. The weight score is set to 20200201 (an integer representing a date), and the member value is the ID of the article.

# Create indexes
zadd *article.poster:dota 20200201 001 20200201 002
zadd *article.poster:like 20200201 003 20200201 004
zadd *article.key:js 20200201 001 20200201 003
zadd *article.key:c 20200201 002 20200201 004
# "visit" using its value as the weight score
zadd *article.visit 21 001 11 002 34 003 44 004

Query by index

Find the intersection of the two indexes *article.key:js and *article.poster:dota and store them in the *tmp1 ordered list:

zinterstore *tmp1 2 *article.key:js *article.poster:dota
> 1

Then *tmp1 stores the ID set that meets the key = js and poster = dota conditions:

zrange *tmp1 0 -1
> 001
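
The steps so far use only standard Redis commands, so the same manual indexing can be sketched against plain Redis. This assumes a redis-py client `r` with `decode_responses=True`; `index_article` and `find_ids` are hypothetical helpers, and deleting the temporary key afterwards is an addition that the walkthrough above does not do.

```python
def index_article(r, article_id, poster, visit, key, score):
    """Store the article hash and add its id to each sorted-set index."""
    r.hset(
        f"article:{article_id}",
        mapping={"poster": poster, "visit": visit, "key": key},
    )
    r.zadd(f"*article.poster:{poster}", {article_id: score})
    r.zadd(f"*article.key:{key}", {article_id: score})
    # "visit" uses its own value as the score
    r.zadd("*article.visit", {article_id: visit})


def find_ids(r, key, poster, tmp="*tmp1"):
    """Intersect the key and poster indexes and return the matching ids."""
    r.zinterstore(tmp, [f"*article.key:{key}", f"*article.poster:{poster}"])
    ids = r.zrange(tmp, 0, -1)
    r.delete(tmp)  # clean up the temporary sorted set
    return ids
```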

You can use the zrangehmget command to print the corresponding HASH value:

zrangehmget *tmp1 0 -1 article: key poster
1) 001
2) 40400402
3) js
4) dota
5) 
6) 

The result is the same as a direct full-text search for key = js and poster = dota:

hsearch article:* key = js poster = dota
1) article:001
2) js
3) dota

Search range

For example, to search for data with a number of visits between 20 and 30 and key = js, you can control the weights used in the intersection.

Create a temporary index that keeps only the members with key = js and takes its scores from *article.visit (weights 0 and 1):

zinterstore *tmp2 2 *article.key:js *article.visit weights 0 1
> 2

Take data between 20 and 30

zrangebyscore *tmp2 20 30
> 001
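
To see why the weights matter, here is a small in-memory Python model of what ZINTERSTORE computes with its default SUM aggregation. It does not talk to Redis; the `zinterstore` function below is a stand-in written for illustration, with plain dicts playing the role of sorted sets.

```python
def zinterstore(sets, weights=None):
    """Model of ZINTERSTORE: keep common members, sum the weighted scores."""
    weights = weights or [1] * len(sets)
    common = set(sets[0]).intersection(*sets[1:])
    return {
        member: sum(w * s[member] for s, w in zip(sets, weights))
        for member in common
    }


key_js = {"001": 20200201, "003": 20200201}
poster_dota = {"001": 20200201, "002": 20200201}
visit = {"001": 21, "002": 11, "003": 34, "004": 44}

# Default weights: the two date scores are added together.
tmp1 = zinterstore([key_js, poster_dota])

# Weights 0 and 1: the date score is zeroed out, only the visit count remains.
tmp2 = zinterstore([key_js, visit], weights=[0, 1])

# Equivalent of "zrangebyscore *tmp2 20 30"
hits = sorted(m for m, score in tmp2.items() if 20 <= score <= 30)
```

With the default weights the scores are summed, which would explain the 40400402 (20200201 + 20200201) printed by zrangehmget earlier. With weights 0 and 1 the score of each surviving member is just its visit count, so a score range becomes a range over visits.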

You can use zrangehmgetbyscore to print the corresponding hash data:

zrangehmgetbyscore *tmp2 20 30 article: key visit
1) 001
2) 21
3) js
4) 21
5) 
6) 

The results are consistent with those using full-text search:

hsearch article:* visit >= 20 visit <= 30 key = js
1) article:001
2) 21
3) 
4) js

Because the visit field appears twice (visit >= 20 and visit <= 30), the search result outputs its value only once, and the repeated field in the third row is output as empty.

For more OnceDB extended commands, see: Search, query, calculate, and sum instructions in OnceDB

Automatic indexing

Creating and maintaining Redis indexes by hand is not very convenient. OnceDB can optionally create auxiliary indexes automatically when data is modified.

Create index: upsert schema field operator value ...

Use the upsert / insert / update commands with special operators to create indexes automatically.

The example above can be written as:

upsert article id @ 001 poster ? dota visit / 21 key ? js
upsert article id @ 002 poster ? dota visit / 11 key ? c
upsert article id @ 003 poster ? like visit / 34 key ? js
upsert article id @ 004 poster ? like visit / 44 key ? c

Operators:

  • @: Primary key
  • ?: Group index
  • /: Sort index

The following indexes are created automatically after the operation: *article, *article.poster:dota, *article.poster:like, *article.visit, *article.key:js, *article.key:c

Multi-condition index query: find schema from to field operator value ...

For fields with indexes, you can use the find command to query through the indexed fields. For example, to query data where key = js and poster = dota, use "?" to indicate that these two fields are group indexes:

find article 0 -1 key ? js poster ? dota
1) 1
2) article:001
3) js
4) dota

The leading 1 is the total number of matching records. If it is -1, full-text search was used instead of the indexes, and performance is poor.

Index range query

You can add @ to specify an index range and use + to specify which index field to use for the score weight range.

find article 0@20 -1@30 key ? js visit /+ *
1) 1
2) article:001
3) js
4) 21

Delete index

OnceDB does not store index definitions. When deleting, you need to manually indicate which fields contain indexes. You need to specify the field names and index operators.

remove article @ 001 key ? poster ? visit /

You can also customize the index name and weight score. For more instructions, please see: OnceDB data modification and query help documentation

Kris Zhang answered Sep 28 '22 19:09