Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

High Redis latency in AWS (ElastiCache)

I'm trying to determine the cause of some high latency I'm seeing on my ElastiCache Redis node (cache.m3.medium). I gathered some data using the redis-cli latency test, running it from an EC2 instance in the same region/availability-zone as the ElastiCache node.

I see that the latency is quite good on average (~.5ms), but that there are some pretty high outliers. I don't believe that the outliers are due to network latency, as network ping tests between two EC2 instances don't exhibit these high spikes.

The Redis node is not under any load, and the metrics seem to look fine.

My questions are:

  1. What might be causing the high max latencies?
  2. Are these max latencies expected?
  3. What other steps/tests/tools would you use to further diagnose the issue?

.

user@my-ec2-instance:~/redis-3.2.8$ ./src/redis-cli -h redis-host --latency-history -i 1
min: 0, max: 12, avg: 0.45 (96 samples) -- 1.01 seconds range
min: 0, max: 1, avg: 0.33 (96 samples) -- 1.00 seconds range
min: 0, max: 3, avg: 0.33 (96 samples) -- 1.01 seconds range
min: 0, max: 2, avg: 0.29 (96 samples) -- 1.01 seconds range
min: 0, max: 2, avg: 0.26 (96 samples) -- 1.01 seconds range
min: 0, max: 1, avg: 0.34 (96 samples) -- 1.00 seconds range
min: 0, max: 4, avg: 0.34 (96 samples) -- 1.01 seconds range
min: 0, max: 1, avg: 0.26 (96 samples) -- 1.00 seconds range
min: 0, max: 5, avg: 0.33 (96 samples) -- 1.01 seconds range
min: 0, max: 1, avg: 0.31 (96 samples) -- 1.00 seconds range
min: 0, max: 1, avg: 0.33 (96 samples) -- 1.00 seconds range
min: 0, max: 1, avg: 0.28 (96 samples) -- 1.00 seconds range
min: 0, max: 1, avg: 0.30 (96 samples) -- 1.00 seconds range
min: 0, max: 4, avg: 0.35 (96 samples) -- 1.01 seconds range
min: 0, max: 15, avg: 0.52 (95 samples) -- 1.01 seconds range
min: 0, max: 4, avg: 0.48 (94 samples) -- 1.00 seconds range
min: 0, max: 2, avg: 0.54 (94 samples) -- 1.00 seconds range
min: 0, max: 1, avg: 0.38 (96 samples) -- 1.01 seconds range
min: 0, max: 8, avg: 0.55 (94 samples) -- 1.00 seconds range
like image 312
Chris McBride Avatar asked Apr 29 '17 17:04

Chris McBride


1 Answers

I ran tests with several different node types, and found that bigger nodes performed much better. I'm using the cache.m3.xlarge type, which has provided more consistent network latency.

like image 193
Chris McBride Avatar answered Nov 20 '22 09:11

Chris McBride