Best EC2 setup for redis server [closed]

Tags: redis, amazon-ec2

We are deploying a large-scale web application that uses only Redis as a data store. I notice that the benchmark of our Redis master is around 8000 transactions per second on EC2, far less than the stated benchmarks on dedicated hardware.

I understand that there is a performance penalty for running Redis on a virtual machine like EC2, but I would love some pointers from people who have deployed Redis in production on EC2 about which EC2 setup you have found most effective for getting more out of Redis.

Thanks.

535

asked Aug 01 '12 18:08

andrewl

1 Answer

EC2 is probably not the best virtualized environment to run Redis on, but it is a popular one, and there are a number of things to know to get the best out of Redis on this platform.

I'm one of the authors of http://redis.io/topics/benchmarks and http://redis.io/topics/latency which cover most of the topics I present below. This is just a summary of the main points.

Virtualization toll

It is not specific to EC2, but Redis is significantly slower when running on a VM (in terms of maximum supported throughput). This is because, for basic operations, Redis does not add much overhead to the epoll/read/write system calls required to handle client connections (like memcached or other efficient key/value stores). System calls are typically more expensive on a VM, and they represent a significant part of Redis activity (especially in benchmarks). Under these conditions, a 50% decrease in maximum throughput compared to bare metal is not uncommon.

Of course, it also depends on the quality of the hypervisor. For EC2, Xen is used.

Benchmarking in good conditions

Benchmarking can be tricky, especially on a platform like EC2. One point often forgotten is to ensure a proper configuration for both the benchmark client and the server. For instance, do not run redis-benchmark on a CPU-starved micro instance (which will likely be throttled down by Amazon) while targeting your Redis server. Both machines are equally important to get a good maximum throughput.

Actually, to evaluate Redis performance, you need to:

run redis-benchmark locally (on the same machine as the server), assuming you have more than one vCPU core
run redis-benchmark remotely (from a different VM), on a machine whose QoS configuration is equivalent to that of the server machine

This way, you can evaluate and compare the performance of the machines and of the network.
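
As a rough sketch of the two runs described above, the following Python helpers build a redis-benchmark command line and extract the throughput figures from its quiet (-q) output. The remote address 10.0.0.12, the client and request counts, and the helper names are assumptions made for illustration. The exact output format of redis-benchmark differs between versions, so the parser only looks for the "requests per second" figure.

```python
import re

def benchmark_cmd(host="127.0.0.1", port=6379, clients=50, requests=100000):
    """Build a redis-benchmark command line (quiet mode, SET and GET only)."""
    return [
        "redis-benchmark",
        "-h", host,
        "-p", str(port),
        "-c", str(clients),   # number of parallel connections
        "-n", str(requests),  # total number of requests
        "-t", "set,get",      # limit the run to these tests
        "-q",                 # quiet mode: one summary line per test
    ]

def parse_throughput(output):
    """Map each test name to its requests-per-second figure."""
    results = {}
    # Progress updates may be separated by carriage returns, so split on both.
    for line in re.split(r"[\r\n]+", output):
        match = re.match(r"\s*([^:]+):\s*(\d+(?:\.\d+)?) requests per second", line)
        if match:
            results[match.group(1).strip()] = float(match.group(2))
    return results

# Hypothetical usage, with redis-benchmark installed on the client machine:
#   import subprocess
#   local = subprocess.run(benchmark_cmd(), capture_output=True, text=True)
#   remote = subprocess.run(benchmark_cmd(host="10.0.0.12"),
#                           capture_output=True, text=True)
#   print(parse_throughput(local.stdout), parse_throughput(remote.stdout))
```

Running the same command locally on the server and then from a second VM gives two sets of numbers: the gap between them is a reasonable first estimate of what the network costs you.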

On EC2, you will get the best results with second-generation M3 instances (or high-memory or cluster compute instances), so that you can benefit from HVM (hardware virtualization) instead of relying on slower paravirtualization.

The fork issue

This is not specific to EC2 but to Xen: forking a large process can be really slow on Xen (it looks better with KVM). For Redis, this is a big problem if you plan to use persistence: both persistence options (RDB and AOF) require the main thread to fork in order to launch background save or rewrite processes.

In some cases, fork latency can freeze the Redis event loop for several seconds. The more memory the Redis instance manages, the higher the latency.
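
To see how long forks actually take on a given instance, one option is to look at the latest_fork_usec field reported by the INFO command. The sketch below only parses INFO text that you have already fetched (for instance with redis-cli INFO); the function names are assumptions made for illustration.

```python
def parse_info(info_text):
    """Parse the text returned by the Redis INFO command into a dict of strings."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Stats".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields

def fork_latency_ms(info_text):
    """Return the duration of the latest fork in milliseconds, or None if absent."""
    value = parse_info(info_text).get("latest_fork_usec")
    return None if value is None else int(value) / 1000.0
```

Checking this value after a background save on a production-sized dataset tells you whether the fork pause is acceptable for your latency requirements.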

On EC2, be sure to use an HVM-enabled instance (M3, high-memory, cluster compute): it will mitigate the issue.

Then, if you have large memory requirements and your application can tolerate it, consider running several smaller Redis instances on the same machine and sharding your data across them. This can decrease the latency due to fork operations to an acceptable level.
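
A minimal sketch of client-side sharding, assuming four Redis instances listening on consecutive ports of the same machine (the ports, the key names, and the choice of CRC32 are illustrative assumptions, not something Redis imposes):

```python
import zlib

# Hypothetical layout: four smaller instances on the same host.
SHARDS = [("127.0.0.1", 6379 + i) for i in range(4)]

def shard_for_key(key, shards=SHARDS):
    """Pick the (host, port) responsible for a key using CRC32 modulo shard count."""
    index = zlib.crc32(key.encode("utf-8")) % len(shards)
    return shards[index]

# Example: every client computes the same target for a given key.
#   host, port = shard_for_key("user:42")
```

Note that with a plain modulo scheme, changing the number of shards remaps most keys, so it works best when the shard count is fixed up front.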

Persistence configuration

This is a key point to get good performance from Redis (both on VMs and on bare metal). So please take the time to carefully read http://redis.io/topics/persistence

If you use RDB, keep in mind that the copy-on-write (COW) mechanism will start duplicating memory pages once the background save process has been forked off. So you need to ensure there is enough memory for Redis itself, plus some margin to cope with COW. The amount of extra memory depends on your workload: the more you write to the instance, the more extra memory you need.

Please note writing a file may also consume some memory (because of the filesystem cache), so during a Redis background save, you need to account for Redis memory, COW overhead, and size of the dump file.
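
The accounting above can be written down as simple arithmetic. In this sketch, cow_fraction is a hypothetical parameter: the share of Redis memory pages you expect to be modified (and therefore duplicated) while the background save is running. It depends entirely on your write workload and has to be measured or estimated.

```python
def required_memory_mb(redis_used_mb, cow_fraction, dump_file_mb):
    """Rough memory budget during a background save, in megabytes.

    redis_used_mb: memory used by the Redis dataset
    cow_fraction:  assumed share of pages duplicated by copy-on-write (0.0 to 1.0)
    dump_file_mb:  size of the dump file, which may sit in the filesystem cache
    """
    if not 0.0 <= cow_fraction <= 1.0:
        raise ValueError("cow_fraction must be between 0.0 and 1.0")
    cow_mb = redis_used_mb * cow_fraction
    return redis_used_mb + cow_mb + dump_file_mb

# Example: an 8000 MB dataset, 25% of pages touched during the save,
# and a 2000 MB dump file.
budget = required_memory_mb(8000, 0.25, 2000)  # 8000 + 2000 + 2000 = 12000 MB
```

With a write-heavy workload, cow_fraction can approach 1.0, which means budgeting about twice the dataset size before even counting the dump file.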

The machine running the Redis server must never swap. If it does, the result will be catastrophic. Contrary to some other stores, Redis is not virtual memory friendly.

With Linux, be sure to set sensible system parameters: vm.overcommit_memory=1 and vm.swappiness=0 (or a very low value anyway). Do not use old kernel versions: they are quite bad at enforcing a low swappiness (resulting in swapping when a large file is written).
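
As a quick sanity check, something like the following sketch can read the two parameters from /proc/sys/vm on a Linux machine and report deviations. The threshold of 10 for "a very low value" of swappiness is my own assumption; pick whatever limit suits you.

```python
from pathlib import Path

def read_vm_setting(name, proc_root="/proc/sys/vm"):
    """Read an integer kernel VM parameter, e.g. 'swappiness' (Linux only)."""
    return int(Path(proc_root, name).read_text().strip())

def check_vm_settings(read=read_vm_setting, max_swappiness=10):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    overcommit = read("overcommit_memory")
    if overcommit != 1:
        problems.append("vm.overcommit_memory is %d, expected 1" % overcommit)
    swappiness = read("swappiness")
    if swappiness > max_swappiness:  # assumed threshold for "very low"
        problems.append("vm.swappiness is %d, expected 0 or a very low value" % swappiness)
    return problems

# On the Redis server itself:
#   for problem in check_vm_settings():
#       print(problem)
```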

If you use AOF, review the fsync options. It is a tradeoff between raw performance and durability of the write operations. You need to make a choice and define a strategy.
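
For reference, the fsync policy is controlled by the appendfsync directive in redis.conf, which accepts always, everysec, or no. The small helper below just renders the corresponding configuration fragment; the helper name and the one-line descriptions are mine.

```python
# The three values accepted by the appendfsync directive.
FSYNC_POLICIES = {
    "always": "fsync after every write: slowest, most durable",
    "everysec": "fsync about once per second: roughly one second of writes at risk",
    "no": "let the operating system decide when to flush: fastest, least durable",
}

def aof_config(policy="everysec"):
    """Render a redis.conf fragment enabling AOF with the given fsync policy."""
    if policy not in FSYNC_POLICIES:
        raise ValueError("unknown appendfsync policy: %r" % policy)
    return "appendonly yes\nappendfsync %s\n" % policy
```

For example, aof_config("always") yields the two lines to put in redis.conf if you would rather pay for every write than risk losing the last second of data.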

You also need to get familiar with the EC2 storage options. On some instance types, you have the choice between ephemeral storage and EBS. On others, you only have EBS.

Ephemeral storage is generally faster, and you will probably get fewer issues than with EBS, but you can easily lose your data in case of disk failure, reboot of the host, etc. You can imagine putting RDB snapshots on ephemeral storage and then copying the resulting files to EBS directories, as a tradeoff between performance and robustness.
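
The copy step could look like the sketch below, meant to be run periodically (from cron, for instance). The paths in the usage comment are hypothetical mount points. The copy goes to a temporary name first and is then renamed, so a reader of the EBS directory never sees a half-written file.

```python
import os
import shutil

def archive_snapshot(rdb_path, ebs_dir):
    """Copy an RDB snapshot to an EBS-backed directory and return the new path."""
    os.makedirs(ebs_dir, exist_ok=True)
    final_path = os.path.join(ebs_dir, os.path.basename(rdb_path))
    tmp_path = final_path + ".tmp"
    shutil.copy2(rdb_path, tmp_path)   # copy contents and timestamps
    os.replace(tmp_path, final_path)   # atomic rename within the same filesystem
    return final_path

# Hypothetical mount points:
#   archive_snapshot("/mnt/ephemeral/redis/dump.rdb", "/mnt/ebs/redis-backups")
```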

EBS is remote storage: it may eat the standard network bandwidth allocated to the VM, and impact the maximum throughput of Redis. If you plan to use EBS, consider selecting the "EBS-optimized" option to establish a QoS between the standard network and storage links.

Finally, a very common setup for performance-demanding instances on EC2 is to deactivate persistence on the master and only activate it on a slave instance. It is probably less safe for the data, but it may prevent a lot of potential latency issues on the master.
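
A sketch of what the two configurations might contain, rendered from Python for convenience. The master address 10.0.0.10 in the usage comment is a placeholder, and choosing AOF with everysec on the slave is just one possible option. The directives themselves (save, appendonly, appendfsync, slaveof) are standard redis.conf settings; recent Redis versions also accept replicaof as the newer name for slaveof.

```python
def master_config():
    """redis.conf fragment for a master with persistence turned off."""
    # An empty "save" argument disables RDB snapshots; AOF is disabled as well.
    return 'save ""\nappendonly no\n'

def slave_config(master_host, master_port=6379, fsync="everysec"):
    """redis.conf fragment for a slave that replicates the master and persists."""
    return (
        "slaveof %s %d\n"
        "appendonly yes\n"
        "appendfsync %s\n"
    ) % (master_host, master_port, fsync)

# Hypothetical usage:
#   print(master_config())
#   print(slave_config("10.0.0.10"))
```

One thing to watch with this setup: if a master without persistence is restarted automatically, it comes back empty and the slave will then resynchronize with that empty dataset, so avoid auto-restarting such a master.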

182

answered Oct 16 '22 09:10

Didier Spezia
