
Understanding AWS ELB Latency

I'm keen to understand exactly what the ELB Latency Statistic provided by CloudWatch means.

According to the docs:

  • ELB Latency: "Measures the time elapsed in seconds after the request leaves the load balancer until the response is received."

http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/US_MonitoringLoadBalancerWithCW.html

What I'm not 100% clear on is whether the response gets buffered by the ELB before it is transferred to the client.

Does the statement in the docs mean:

  • ELB Latency: "Measures the time elapsed in seconds after the request leaves the load balancer until the response is received [by the client]."

Or:

  • ELB Latency: "Measures the time elapsed in seconds after the request leaves the load balancer until the response is received [by the ELB]."

I want to understand whether a poor Maximum Latency CloudWatch metric could be explained by a significant number of users on 3G connections, or whether it instead indicates an underlying problem with the app servers occasionally responding slowly.
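For reference, the raw datapoints behind that metric can be pulled straight from CloudWatch; a minimal sketch using boto3 (the load balancer name, region, and time window are placeholder assumptions):

```python
# Minimal sketch: fetch the ELB Latency metric from CloudWatch with boto3.
# The load balancer name, region, and time window are placeholder assumptions.
from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

response = cloudwatch.get_metric_statistics(
    Namespace="AWS/ELB",                       # Classic ELB namespace
    MetricName="Latency",
    Dimensions=[{"Name": "LoadBalancerName", "Value": "my-load-balancer"}],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=60,                                 # one datapoint per minute
    Statistics=["Average", "Maximum"],         # compare typical vs. worst case
    Unit="Seconds",
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"], point["Maximum"])
```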

asked Sep 07 '14 by sungiant


2 Answers

According to AWS support:

As the ELB (when configured with HTTP listeners) acts as a proxy (request headers come in and get validated, then get sent to the backend), the latency metric starts ticking as soon as the headers are sent to the backend and stops when the backend sends the first byte of the response.

In the case of POSTs (or any HTTP method where the client is sending additional data), the latency keeps ticking while the client uploads the data (as the backend needs the complete request before it can send a response) and stops once the backend sends out the first byte of the response. So if you have a slow client sending data, the latency will take into account the upload time plus the time the backend took to respond.
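One way to observe this distinction from the client side is to time the first byte separately from the full body; a rough sketch using Python's requests library (the URL is a placeholder):

```python
# Rough sketch: separate time-to-first-byte from total download time,
# mirroring the distinction described above. The URL is a placeholder.
import time

import requests

start = time.monotonic()
resp = requests.get("https://example.com/some-endpoint", stream=True)

body = resp.iter_content(chunk_size=8192)
next(body, b"")                        # read the first chunk of the body
ttfb = time.monotonic() - start

for _ in body:                         # drain the rest of the body
    pass
total = time.monotonic() - start

# ttfb roughly tracks what ELB's Latency metric measures (backend response);
# total - ttfb is dominated by transfer to the client, which Latency ignores.
print(f"time to first byte: {ttfb:.3f}s, full download: {total:.3f}s")
```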

answered Oct 27 '22 by sungiant


It appears to be a measurement, from the ELB's perspective, of how long the server takes to generate its response, without regard to how long the ELB might then need to return that response to the client.

I came to this conclusion by reviewing my own logs in one of my applications, which uses ELB in front of another load balancer, HAProxy, which in turn is in front of the actual application servers. (This may seem redundant, but it gives us several advantages over using only ELB or only HAProxy.)

Here's the setup I'm referring to:

ELB -->>-- EC2+HAProxy -->>-- EC2+Nginx (multiple instances)

HAProxy logs several time metrics on each request, including one called Tr.

Tr: server response time (HTTP mode only). It's the time elapsed between the moment the TCP connection was established to the server and the moment the server sent its complete response headers. It purely shows its request processing time, without the network overhead due to the data transmission.
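If you want to line the two up yourself, the five slash-separated timers (Tq/Tw/Tc/Tr/Tt) can be pulled straight out of HAProxy's default HTTP log format; a quick sketch (the log path is a placeholder):

```python
# Quick sketch: extract the Tr timer (server response time, in ms) from
# HAProxy HTTP-format log lines. The log path is a placeholder, and the
# regex assumes the default Tq/Tw/Tc/Tr/Tt field layout. Timers can be
# -1 on aborted requests, hence the optional minus sign.
import re

# Matches the first run of five slash-separated numbers, e.g. "10/0/30/69/109"
TIMERS = re.compile(r"\s(-?\d+)/(-?\d+)/(-?\d+)/(-?\d+)/(-?\d+)\s")

def server_response_times(path):
    """Yield Tr (ms) for each request in an HAProxy HTTP log."""
    with open(path) as log:
        for line in log:
            match = TIMERS.search(line)
            if match:
                yield int(match.group(4))  # 4th timer is Tr

for tr_ms in server_response_times("/var/log/haproxy.log"):
    print(tr_ms)
```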

Now, stick with me for an explanation of why so much discussion of what HAProxy is doing here is relevant to ELB and the Latency metric.

Even though HAProxy logs a number of other timers related to how long the proxy spends waiting for various events on each request/response, this Tr timer is the single timer in my HAProxy logs that neatly corresponds to the values logged by CloudWatch's "Latency" metric for the ELB on a minute-by-minute basis, give or take a millisecond or two; the others vary wildly. So I would suggest that this ELB metric is similarly logging the response time of your application server, unrelated to the additional time that might be required to deliver the response back to the client.

Given HAProxy's definition of the timer in question, it seems very unlikely that HAProxy and the ELB would agree so consistently unless ELB's timer were measuring something very similar to what HAProxy is measuring, since these systems are literally measuring the performance of the same app servers on the same requests.

If your application server doesn't already benchmark itself and log its own performance timers, you may want to consider adding them, since (according to my observations) high values for the Latency metric do seem to suggest a responsiveness issue in your application that is unrelated to client connection quality.
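As a starting point, a timer like that can be as small as a piece of WSGI middleware; a minimal, framework-agnostic sketch (the logger name and wiring are illustrative):

```python
# Minimal sketch: WSGI middleware that logs how long the application takes
# to start its response, roughly what ELB's Latency (and HAProxy's Tr)
# measure from the outside. Names here are illustrative.
import logging
import time

log = logging.getLogger("request_timer")

class RequestTimer:
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        start = time.monotonic()

        def timed_start_response(status, headers, exc_info=None):
            # Headers are being sent: the app has produced its response.
            elapsed = time.monotonic() - start
            log.info("%s %s -> %s in %.3fs",
                     environ.get("REQUEST_METHOD"),
                     environ.get("PATH_INFO"),
                     status, elapsed)
            return start_response(status, headers, exc_info)

        return self.app(environ, timed_start_response)

# Usage: wrap any WSGI app, e.g.
#   app = RequestTimer(app)
```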

answered Oct 27 '22 by Michael - sqlbot