I've been investigating Redis's functional capabilities compared with those of a relational database, leaving aside non-functional requirements (NFRs) such as response time and scalability, where I understand Redis excels.
Here, for example, is a list of use cases that Redis can handle for web applications.
That said, one known disadvantage of Redis is business analytics. But how complex do the analytics have to be before Redis becomes less efficient than, say, MySQL?
For example, take the following data structure in MySQL:
Table: User Columns: Id(PK), Name(VarChar), Age(Int)
Table: Message Columns: UserID(FK), Content(VarChar), Importance(Int)
and in my application I want to use the following 2 queries:
1. SELECT Content FROM Message WHERE Importance > 2;
2. SELECT Content FROM Message, User WHERE User.Id = Message.UserID AND User.Age > 30;
My Question:
Can I use Redis to store the data structure above and query it as efficiently as (or more efficiently than) in MySQL?
Short answer: yes.
Long answer: Redis is an amazing piece of technology, but it is not a relational database. NoSQL stores, Redis included, are built on the premise that data should be stored according to the access patterns used with it. Therefore, to accomplish the above you'll first have to store the data "correctly".
To store your tables' rows, it appears that you'll want to use the Hash data structure. In Redis' terminology, here's how you'd create a User key for UserID 123:
HMSET user:123 id 123 name foo age 31
Note 1: the use of a colon (':') in constructing the key's name is merely a convention.
Note 2: while the ID is already a part of the key's name, it is common to also include it as a field in the Hash for easier access.
Similarly, here's how you'll create a Message key (with the ID 987):
HMSET message:987 id 987 userid 123 content bar importance 3
Now comes the fun part :) Redis doesn't have FKs or indices, so you'll have to maintain data structures that will assist you in fetching the data per your requirements. For your first query, the best choice is keeping a Sorted Set in which the members are the message IDs and the scores are the importance. Therefore do:
ZADD messages_by_importance 3 987
Fetching messages' content with importance greater than 2 will be done with two operations as shown by this pseudo-Pythonic code:
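import redis
# decode_responses=True returns str instead of bytes, so IDs can be concatenated into key names
r = redis.Redis(decode_responses=True)
messages = r.zrangebyscore('messages_by_importance', '(2', '+inf')
for msg in messages:
    content = r.hget('message:' + msg, 'content')
    do_something(content)  # placeholder for your own handling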
Note 3: this snippet is quite naive and can be optimized for better performance, but it should provide you with the basic gist.
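One way to act on Note 3 is pipelining: rather than paying one network round trip per HGET, queue all of them and send them together. The following is only a sketch, assuming redis-py and a client created with decode_responses=True; the function name fetch_important_contents is made up for illustration.

```python
def fetch_important_contents(r, min_importance=2):
    """Return the content of every message whose importance is > min_importance.

    `r` is assumed to be a redis-py client created with decode_responses=True,
    so that message IDs come back as str rather than bytes.
    """
    # A leading '(' makes the lower bound exclusive, i.e. strictly greater than.
    ids = r.zrangebyscore('messages_by_importance', f'({min_importance}', '+inf')

    # Queue one HGET per message ID; nothing is sent until execute().
    pipe = r.pipeline(transaction=False)
    for msg_id in ids:
        pipe.hget(f'message:{msg_id}', 'content')

    # A single round trip returns the replies in the order they were queued.
    return pipe.execute()
```

With a local server this could be called as fetch_important_contents(redis.Redis(decode_responses=True)). The query then costs two round trips in total, regardless of how many messages match.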
For the second query, you'll first need to find users who are older than 30 years - again, the same Sorted Set trick should be used:
ZADD users_by_age 31 123
ZRANGEBYSCORE users_by_age (30 +inf
This will get you the list of all users that match your criterion, but you'll also need to keep track (index) of all messages per user. To do this, use a Set:
SADD user:123:messages 987
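Because Redis will not maintain these indices for you, every write has to touch the Hash and both index keys. Here is a hedged sketch of such a write path with redis-py, wrapping the three commands in a MULTI/EXEC transaction so that other clients do not observe a message that is missing from an index. The helper name add_message is hypothetical, and `r` is assumed to be a redis-py client.

```python
def add_message(r, msg_id, user_id, content, importance):
    """Store a message and update both indices as one MULTI/EXEC unit."""
    pipe = r.pipeline(transaction=True)

    # The row itself, stored as a Hash.
    pipe.hset(f'message:{msg_id}', mapping={
        'id': msg_id,
        'userid': user_id,
        'content': content,
        'importance': importance,
    })

    # Index for the first query: message IDs scored by importance.
    pipe.zadd('messages_by_importance', {msg_id: importance})

    # Index for the second query: message IDs grouped by their author.
    pipe.sadd(f'user:{user_id}:messages', msg_id)

    # Nothing is sent until execute(); the replies come back as a list.
    return pipe.execute()
```

For the example data above, the call would be add_message(r, 987, 123, 'bar', 3). Deleting or re-scoring a message needs the mirror-image bookkeeping (ZREM, SREM, or another ZADD), which this sketch leaves out.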
To tie everything together, here's another pseudo-snippet:
users = r.zrangebyscore('users_by_age', '(30', '+inf')
for user in users:
    messages = r.smembers('user:' + user + ':messages')
    for msg in messages:
        content = r.hget('message:' + msg, 'content')
        do_something(content)
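The nested loops above issue one request per user and one per message. A possible refinement, sketched here under the assumption of a redis-py client created with decode_responses=True, batches each level into a single pipeline so the whole query takes three round trips. The function name contents_by_author_age is invented for this example.

```python
def contents_by_author_age(r, min_age=30):
    """Return the content of all messages written by users older than min_age."""
    # Round trip 1: IDs of users whose age is strictly greater than min_age.
    user_ids = r.zrangebyscore('users_by_age', f'({min_age}', '+inf')

    # Round trip 2: fetch every matching user's set of message IDs at once.
    pipe = r.pipeline(transaction=False)
    for user_id in user_ids:
        pipe.smembers(f'user:{user_id}:messages')
    msg_ids = [m for members in pipe.execute() for m in members]

    # Round trip 3: fetch the content field of each of those messages.
    pipe = r.pipeline(transaction=False)
    for msg_id in msg_ids:
        pipe.hget(f'message:{msg_id}', 'content')
    return pipe.execute()
```

Sets are unordered, so the contents come back in no particular order. If ordering matters, a Sorted Set per user (scored by timestamp, for instance) would be one option.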
This should be enough to get you started, but once you've got a firm grip on the basics, look into optimizing these flows. Easy gains can be had from pipelining, Lua scripting and smarter indices according to your needs... and if you need any further assistance - just ask :)