In practical terms, Redis databases should be used to separate different keys belonging to the same application (if needed), and not to use a single Redis instance for multiple unrelated applications. When using Redis Cluster, the SELECT command cannot be used, since Redis Cluster only supports database zero.
Redis comes with support for multiple databases, which is very similar to the concept in SQL databases. In SQL databases, such as MySQL, PostgreSQL, and Oracle, you can define a name for your databases. However, Redis databases are represented by numbers.
Managing Databases. Out of the box, a Redis instance supports 16 logical databases. These databases are effectively siloed off from one another, and when you run a command in one database it doesn't affect any of the data stored in other databases in your Redis instance.
Redis is actually a very reliable engine to store data, as long as you keep in mind its design principles. (the Aphyr post is great with that). The replication and cluster modes can add additional complexity (but of course mysql master-slave or master-master have their own risks and complexity as well.)
In principal, Redis databases on the same instance are no different than schemas in RDBMS database instances.
So, with all of that said, why/when would I ever want to use multiple Redis databases instead of just spinning up an extra instance of Redis for each extra database I want?
There's one clear advantage of using redis databases in the same redis instance, and that's management. If you spin up a separate instance for each application, and let's say you've got 3 apps, that's 3 separate redis instances, each of which will likely need a slave for HA in production, so that's 6 total instances. From a management standpoint, this gets messy real quick because you need to monitor all of them, do upgrades/patches, etc. If you don't plan on overloading redis with high I/O, a single instance with a slave is simpler and easier to manage provided it meets your SLA.
You don't want to use multiple databases in a single redis instance. As you noted, multiple instances lets you take advantage of multiple cores. If you use database selection you will have to refactor when upgrading. Monitoring and managing multiple instances is not difficult nor painful.
Indeed, you would get far better metrics on each db by segregation based on instance. Each instance would have stats reflecting that segment of data, which can allow for better tuning and more responsive and accurate monitoring. Use a recent version and separate your data by instance.
As Jonaton said, don't use the keys command. You'll find far better performance if you simply create a key index. Whenever adding a key, add the key name to a set. The keys command is not terribly useful once you scale up since it will take significant time to return.
Let the access pattern determine how to structure your data rather than store it the way you think works and then working around how to access and mince it later. You will see far better performance and find the data consuming code often is much cleaner and simpler.
Regarding single threaded, consider that redis is designed for speed and atomicity. Sure actions modifying data in one db need not wait on another db, but what if that action is saving to the dump file, or processing transactions on slaves? At that point you start getting into the weeds of concurrency programming.
By using multiple instances you turn multi threading complexity into a simpler message passing style system.
Even Salvatore Sanfilippo (creator of Redis) thinks it's a bad idea to use multiple DBs in Redis. See his comment here:
https://groups.google.com/d/topic/redis-db/vS5wX8X4Cjg/discussion
I understand how this can be useful, but unfortunately I consider Redis multiple database errors my worst decision in Redis design at all... without any kind of real gain, it makes the internals a lot more complex. The reality is that databases don't scale well for a number of reason, like active expire of keys and VM. If the DB selection can be performed with a string I can see this feature being used as a scalable O(1) dictionary layer, that instead it is not.
With DB numbers, with a default of a few DBs, we are communication better what this feature is and how can be used I think. I hope that at some point we can drop the multiple DBs support at all, but I think it is probably too late as there is a number of people relying on this feature for their work.
I don't really know any benefits of having multiple databases on a single instance. I guess it's useful if multiple services use the same database server(s), so you can avoid key collisions.
I would not recommend building around using the KEYS
command, since it's O(n) and that doesn't scale well. What are you using it for that you can accomplish in another way? Maybe redis isn't the best match for you if functionality like KEYS
is vital.
I think they mention the benefits of a single threaded server in their FAQ, but the main thing is simplicity - you don't have to bother with concurrency in any real way. Every action is blocking, so no two things can alter the database at the same time. Ideally you would have one (or more) instances per core of each server, and use a consistent hashing algorithm (or a proxy) to divide the keys among them. Of course, you'll loose some functionality - piping will only work for things on the same server, sorts become harder etc.
I am using redis for implementing a blacklist of email addresses , and i have different TTL values for different levels of blacklisting , so having different DBs on same instance helps me a lot .
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With