I was able to insert 10 million values in about 15 minutes, and I am happy with the speed. I have two questions:
1) Will I have any problems if I have 30 times more data (approx. 300 million values)? I have the option of increasing RAM if necessary.
2) I used the default Redis installation for this test. Is any config change recommended? Below is the log output:
[3112] 20 Oct 09:53:50.074 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 09:53:50.101 * Background saving started by pid 3389
[3389] 20 Oct 09:53:51.258 * DB saved on disk
[3389] 20 Oct 09:53:51.260 * RDB: 1 MB of memory used by copy-on-write
[3112] 20 Oct 09:53:51.301 * Background saving terminated with success
[3112] 20 Oct 09:54:52.101 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 09:54:52.127 * Background saving started by pid 3394
[3394] 20 Oct 09:54:53.348 * DB saved on disk
[3394] 20 Oct 09:54:53.351 * RDB: 1 MB of memory used by copy-on-write
[3112] 20 Oct 09:54:53.427 * Background saving terminated with success
[3112] 20 Oct 09:55:54.099 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 09:55:54.140 * Background saving started by pid 3399
[3399] 20 Oct 09:55:56.345 * DB saved on disk
[3399] 20 Oct 09:55:56.348 * RDB: 16 MB of memory used by copy-on-write
[3112] 20 Oct 09:55:56.440 * Background saving terminated with success
[3112] 20 Oct 09:56:57.044 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 09:56:57.115 * Background saving started by pid 3402
[3402] 20 Oct 09:56:59.955 * DB saved on disk
[3402] 20 Oct 09:56:59.961 * RDB: 158 MB of memory used by copy-on-write
[3112] 20 Oct 09:57:00.015 * Background saving terminated with success
[3112] 20 Oct 09:58:01.052 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 09:58:01.144 * Background saving started by pid 3404
[3404] 20 Oct 09:58:04.864 * DB saved on disk
[3404] 20 Oct 09:58:04.874 * RDB: 31 MB of memory used by copy-on-write
[3112] 20 Oct 09:58:04.944 * Background saving terminated with success
[3112] 20 Oct 09:59:05.044 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 09:59:05.137 * Background saving started by pid 3405
[3405] 20 Oct 09:59:10.468 * DB saved on disk
[3405] 20 Oct 09:59:10.480 * RDB: 32 MB of memory used by copy-on-write
[3112] 20 Oct 09:59:10.537 * Background saving terminated with success
[3112] 20 Oct 10:00:11.091 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 10:00:11.238 * Background saving started by pid 3406
[3406] 20 Oct 10:00:16.301 * DB saved on disk
[3406] 20 Oct 10:00:16.316 * RDB: 310 MB of memory used by copy-on-write
[3112] 20 Oct 10:00:16.438 * Background saving terminated with success
[3112] 20 Oct 10:01:17.091 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 10:01:17.251 * Background saving started by pid 3408
[3408] 20 Oct 10:01:23.907 * DB saved on disk
[3408] 20 Oct 10:01:23.925 * RDB: 63 MB of memory used by copy-on-write
[3112] 20 Oct 10:01:24.051 * Background saving terminated with success
[3112] 20 Oct 10:02:25.051 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 10:02:25.208 * Background saving started by pid 3409
[3409] 20 Oct 10:02:32.606 * DB saved on disk
[3409] 20 Oct 10:02:32.627 * RDB: 63 MB of memory used by copy-on-write
[3112] 20 Oct 10:02:32.708 * Background saving terminated with success
[3112] 20 Oct 10:03:33.008 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 10:03:33.182 * Background saving started by pid 3411
[3411] 20 Oct 10:03:41.532 * DB saved on disk
[3411] 20 Oct 10:03:41.555 * RDB: 63 MB of memory used by copy-on-write
[3112] 20 Oct 10:03:41.682 * Background saving terminated with success
[3112] 20 Oct 10:04:42.082 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 10:04:42.272 * Background saving started by pid 3413
[3413] 20 Oct 10:04:51.376 * DB saved on disk
[3413] 20 Oct 10:04:51.401 * RDB: 63 MB of memory used by copy-on-write
[3112] 20 Oct 10:04:51.472 * Background saving terminated with success
[3112] 20 Oct 10:05:52.072 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 10:05:52.279 * Background saving started by pid 3414
[3414] 20 Oct 10:06:02.434 * DB saved on disk
[3414] 20 Oct 10:06:02.461 * RDB: 64 MB of memory used by copy-on-write
[3112] 20 Oct 10:06:02.579 * Background saving terminated with success
[3112] 20 Oct 10:07:03.065 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 10:07:03.366 * Background saving started by pid 3416
[3416] 20 Oct 10:07:15.095 * DB saved on disk
[3416] 20 Oct 10:07:15.129 * RDB: 634 MB of memory used by copy-on-write
[3112] 20 Oct 10:07:15.266 * Background saving terminated with success
[3112] 20 Oct 10:08:16.013 * 10000 changes in 60 seconds. Saving...
[3112] 20 Oct 10:08:16.331 * Background saving started by pid 3420
[3420] 20 Oct 10:08:27.988 * DB saved on disk
[3420] 20 Oct 10:08:28.024 * RDB: 125 MB of memory used by copy-on-write
[3112] 20 Oct 10:08:28.131 * Background saving terminated with success
There are two ways to scale your Redis (cluster mode enabled) cluster: horizontal and vertical scaling. Horizontal scaling allows you to change the number of node groups (shards) in the replication group by adding or removing node groups (shards).
Redis is essentially a data structure server. It supports commands rather than a query language, so ad-hoc queries are not possible. Data access paths have to be designed up front, which costs some flexibility.
There is no query language (only commands) and no support for a relational algebra. You cannot submit ad-hoc queries (like you can using SQL on an RDBMS). All data accesses should be anticipated by the developer, and proper data access paths must be designed. A lot of flexibility is lost.
Redis can handle up to 2^32 keys, and was tested in practice to handle at least 250 million keys per instance. Every hash, list, set, and sorted set can hold 2^32 elements. In other words, your limit is likely the available memory in your system.
If you plan to have 30 times more data, you can run the "INFO memory" command, check the value of used_memory_human, multiply it by 30, add a safety margin, and see whether the result still fits in your server. The safety margin depends on the write throughput applied to the Redis instance: the more you write, the more copy-on-write overhead at dump time. In your log, for example, copy-on-write usage reached 634 MB during one of the dumps.
The rule is that everything should fit in physical memory. Please note that if your server also runs other software, you need to take its memory consumption into account. A server running Redis should never swap.
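As a rough illustration of the arithmetic above, here is a minimal Python sketch. It assumes memory grows linearly with the number of values, and it reads the raw used_memory field (bytes) instead of used_memory_human so no unit parsing is needed. The function names, the sample INFO text, and the 50% safety margin are all made up for the example; pick a margin that matches your own write load.

```python
def parse_used_memory(info_text: str) -> int:
    """Extract used_memory (in bytes) from the text output of INFO memory."""
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "used_memory":
            return int(value)
    raise ValueError("used_memory not found in INFO output")


def estimate_required_bytes(used_memory: int, growth_factor: float,
                            safety_margin: float) -> int:
    """Project memory for growth_factor times more data, plus a relative margin.

    Assumes linear growth, which is only an approximation.
    """
    return int(used_memory * growth_factor * (1.0 + safety_margin))


# Hypothetical INFO memory excerpt; the numbers are invented for illustration.
sample_info = "# Memory\r\nused_memory:1073741824\r\nused_memory_human:1.00G\r\n"

used = parse_used_memory(sample_info)
needed = estimate_required_bytes(used, growth_factor=30, safety_margin=0.5)
print(f"projected need: {needed / 2**30:.1f} GiB")
```

Compare the projected figure with the physical RAM left over once any other software on the box has been accounted for.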
Regarding your second question, there are not many parameters to change to target a production environment. I generally just check the logging level (set it to notice) and the persistence options (i.e. where the dump file will be stored, etc.).
You may want to deactivate the RDB dump while loading your data, and reactivate it afterwards. You can do this from redis-cli with:
# To deactivate RDB dump
config set save ""
# To reactivate RDB dump (dump every 2 hours if at least 1 key changed)
config set save "7200 1"
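If the load is driven from a script, the same toggle can be wrapped so the save points are always put back, even when the load fails. This is only a sketch: it assumes a client object with config_get and config_set methods, as the redis-py client provides, and the name rdb_snapshots_disabled is made up here. Unlike the commands above, it restores whatever save setting was active before rather than a fixed "7200 1".

```python
from contextlib import contextmanager


@contextmanager
def rdb_snapshots_disabled(client):
    """Turn off RDB save points for the duration of the block, then restore them.

    `client` is assumed to expose config_get(pattern) -> dict and
    config_set(name, value), as a redis-py client does.
    """
    previous = client.config_get("save").get("save", "")
    client.config_set("save", "")  # same effect as: config set save ""
    try:
        yield client
    finally:
        # Restore the earlier save points even if the bulk load raised.
        client.config_set("save", previous)
```

With redis-py installed, usage would look like `with rdb_snapshots_disabled(redis.Redis()) as r:` followed by the bulk inserts inside the block.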