Redis SELECT performance

Tags: redis

I am using Redis with multiple databases (which I switch between using the SELECT command).

I am storing different types of information in Redis and I needed to separate them somehow. I didn't want to prefix the keys to distinguish the information type, so I created more databases.

I would like to ask whether this was the right decision with regard to performance.

Also, how much overhead does SELECT cause? If I need to traverse some related data from, say, two databases, which approach is better (see pseudo code)?

for data in array {
  redis_select(0)
  k = redis_get(...)
  redis_select(1)
  k2 = redis_get(k)
}

or

redis_select(0)
k = []
for x, data in array {
  k[x] = redis_get(...)
}

redis_select(1)
k2 = []
for x, data in array {
  k2[x] = redis_get(k[x])
}
Jan Koriťák asked Jan 10 '12 14:01



1 Answer

You can use the Redis database concept to separate data. This is fully supported in the current version and will remain supported in future ones.

Now, this is not the recommended solution to isolate data. It is better to run several Redis instances instead. The overhead of an instance is very low (less than 1 MB), so you can start several of them on any box. It is more scalable (the workload will be distributed on several CPU cores instead of just one). It is more flexible (you may want to use different configuration parameters per data set, or different dump files). Your client just has to open one connection per instance to access the various data sets.
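As a rough illustration of the one-connection-per-instance idea, the sketch below follows a key from one data set into another without any SELECT. It assumes two client objects that expose get() the way redis-py's redis.Redis does; the function name, the ports, and the key in the comments are hypothetical.

```python
def lookup_across_instances(first, second, key):
    """Read a reference from one instance and resolve it in another.

    `first` and `second` are assumed to be client objects with a
    get(key) method (for example redis-py's redis.Redis), each
    connected to its own Redis instance. No SELECT is needed because
    each connection always talks to the same data set.
    """
    ref = first.get(key)
    if ref is None:
        # Nothing stored under this key in the first data set.
        return None
    return second.get(ref)


# Hypothetical wiring with redis-py (ports and key are assumptions):
#   import redis
#   first = redis.Redis(port=6379)
#   second = redis.Redis(port=6380)
#   value = lookup_across_instances(first, second, "some-key")
```

Since the function only relies on get(), it can be tried out with plain dictionaries before pointing it at real instances.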

Now if you still want to use Redis databases and are concerned about performance, you need to evaluate the number of extra roundtrips they introduce. With in-memory stores such as Redis, the cost of all basic operations is almost the same, because it is dominated by communication and protocol handling, not by the execution itself. So when the keys and values are small, GET, SET, and SELECT commands tend to have the same cost. Each SELECT costs about as much as one extra GET or SET.

Taking your examples, the first proposal generates 4 commands per item of the array. The second generates only 2 commands per item plus 2 SELECT commands in total, so it is much more efficient. If the number of items is significant, the cost of SELECT is negligible in the second proposal, while it is not in the first one.
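To make the command counts concrete, here is a small sketch that replays both proposals against a stand-in client which only counts the commands it receives. CountingClient and the sample keys are made up for illustration; it is not a real Redis client, and it assumes every command costs one roundtrip.

```python
class CountingClient:
    """Stand-in for a Redis connection that only counts commands sent."""

    def __init__(self, databases):
        self.databases = databases  # {db_index: {key: value}}
        self.current = 0            # currently selected database
        self.commands = 0           # one roundtrip per command (assumption)

    def select(self, db):
        self.commands += 1
        self.current = db

    def get(self, key):
        self.commands += 1
        return self.databases[self.current].get(key)


def interleaved(client, keys):
    """First proposal: switch databases for every item."""
    results = []
    for key in keys:
        client.select(0)
        ref = client.get(key)
        client.select(1)
        results.append(client.get(ref))
    return results


def batched(client, keys):
    """Second proposal: read everything from db 0, then everything from db 1."""
    client.select(0)
    refs = [client.get(key) for key in keys]
    client.select(1)
    return [client.get(ref) for ref in refs]


def make_client(n):
    """Build a client with n items in db 0 pointing at n values in db 1."""
    db0 = {f"item{i}": f"ref{i}" for i in range(n)}
    db1 = {f"ref{i}": i for i in range(n)}
    return CountingClient({0: db0, 1: db1})


keys = [f"item{i}" for i in range(100)]
first, second = make_client(100), make_client(100)
interleaved(first, keys)
batched(second, keys)
print(first.commands, second.commands)  # 400 202
```

For 100 items the interleaved loop sends 4 × 100 = 400 commands, while the batched version sends 2 × 100 + 2 = 202, which matches the reasoning above.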

If you plan to iterate over arrays to run Redis commands, consider using variadic commands (such as MGET/MSET) or pipelining (if your client supports it) to reduce the overall number of roundtrips.
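A minimal sketch of both suggestions, assuming client objects that behave like redis-py's redis.Redis (mget(keys) returning a list with None for missing keys, and pipeline() returning an object with get() and execute()). The function names are hypothetical, and db0 and db1 are assumed to be two connections, one per database or instance, so no SELECT is issued inside the loop.

```python
def fetch_linked_mget(db0, db1, keys):
    """Resolve keys -> refs -> values with at most two MGET roundtrips."""
    if not keys:
        return []
    refs = db0.mget(keys)
    # MGET needs at least one key, so skip refs that were not found.
    found = [ref for ref in refs if ref is not None]
    values = iter(db1.mget(found)) if found else iter(())
    # Keep the output aligned with the input keys.
    return [None if ref is None else next(values) for ref in refs]


def fetch_pipelined(client, keys):
    """Queue one GET per key and send them in a single batch."""
    # transaction=False asks redis-py not to wrap the batch in MULTI/EXEC.
    pipe = client.pipeline(transaction=False)
    for key in keys:
        pipe.get(key)
    return pipe.execute()
```

With either variant the number of roundtrips no longer grows with the number of items, only the payload size does.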

Didier Spezia answered Oct 17 '22 13:10