This question was asked on the AWS forums without receiving any responses. Below is the original question:
Hi!
We are doing rolling upgrades of our API-instances behind an ELB and are seeing alarmingly long times when waiting for the connection draining to finish. The scenario is as follows:
We're running two identical systems, 4x c3.large behind an ELB, one system for dev and one system for production. The only difference between the two systems is that the production system continuously serves requests.
A rolling upgrade on the dev system takes about 3 minutes for all 4 instances when there is no traffic. On the production system these times fluctuate between 6 and 17+ minutes. For various reasons we need to do these rolling upgrades about twice per hour on average, so 17+ minutes per rolling upgrade is becoming a problem.
All our API calls are < 100ms, so there are no long-running requests that should hold the connection draining back for that long. We have played around with changing the values for both the idle timeout and the connection draining timeout on the ELB, with no good results.
When lowering the connection draining timeout we see 502 responses from the API, since it forcibly drops the connections, and lowering the idle timeout seems to have no effect.
All in all, we would like to know what can be done to reduce these times. Since all our requests take < 100ms, it should in theory not take more than a second or two to drain the connections from an instance. Is there something we are missing here?
A last note: We tried turning off connection draining altogether, and this seemed to work better than lowering the connection draining timeout. On average there were only 1 or 2 errors per test run, and some runs had no errors. Is this because the response times are so fast? Our responses are also relatively small, so might the TCP response be held in the OS output buffer so that it can still be delivered even if connection draining is turned off? What is the difference between having the connection draining timeout set to 0 and having connection draining turned off?
Additional info:
Thanks!
AWS ELB connection draining prevents breaking open network connections while taking an instance out of service, updating its software, or replacing it with a fresh instance that contains updated software.
When you enable connection draining, you can specify a maximum time for the load balancer to keep connections alive before reporting the instance as de-registered. The maximum timeout value can be set between 1 and 3,600 seconds (the default is 300 seconds).
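As a rough illustration of how this setting might be applied programmatically, here is a minimal sketch using the `modify_load_balancer_attributes` call of boto3's Classic Load Balancer (`elb`) client. The helper names `build_draining_attributes` and `set_connection_draining` and the load balancer name `my-api-elb` are hypothetical; the client is passed in as an argument so the function can be tried without AWS credentials.

```python
def build_draining_attributes(enabled, timeout_seconds=300):
    """Build the LoadBalancerAttributes payload for connection draining.

    The default of 300 seconds mirrors the default quoted in the text above.
    """
    draining = {"Enabled": enabled}
    if enabled:
        # Timeout only matters while draining is enabled.
        draining["Timeout"] = timeout_seconds
    return {"ConnectionDraining": draining}


def set_connection_draining(elb_client, lb_name, enabled, timeout_seconds=300):
    """Apply the setting through a boto3 'elb' client (or any object
    exposing the same modify_load_balancer_attributes method)."""
    return elb_client.modify_load_balancer_attributes(
        LoadBalancerName=lb_name,
        LoadBalancerAttributes=build_draining_attributes(enabled, timeout_seconds),
    )
```

With real credentials this could be called as `set_connection_draining(boto3.client("elb"), "my-api-elb", True, 60)`, assuming boto3 is installed and the load balancer exists.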
On Google Cloud, connection draining is a process that ensures that existing, in-progress requests are given time to complete when a VM is removed from an instance group or when an endpoint is removed from a zonal network endpoint group (NEG).
This is a complex question with a number of variables, so I can only offer a few suggestions to look into.
1) Check your Health Check Interval, Response Timeout, and Unhealthy Threshold settings. If, as part of your rolling upgrade, you terminate your instances while the ELB is still performing health checks, the ELB will wait for the duration of "Response Timeout" on each check, irrespective of connection draining. If that timeout is set to 1 minute with 3 retries ("Unhealthy Threshold"), that is 3 minutes per server before the ELB declares the instance dead. So even with connection draining set to zero, no new requests will go to that instance, but the ELB will wait 3 minutes before it decides the instance is actually dead.
Worst case: multiply by 4 instances and you're at 12 minutes before the ELB understands that all instances are dead. In other words, the ELB is busy waiting for health checks to actually fail.
2) Are you unregistering your instances from the ELB prior to terminating them? This avoids the issue in #1 above.
3) Disabling Connection Draining and enabling Connection Draining with a Timeout value of zero should provide equivalent functionality.
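To make suggestions 1 and 2 concrete, here is a minimal Python sketch. The first function reproduces the worst-case arithmetic from suggestion 1; the second deregisters an instance and polls until the ELB no longer reports it as in service, after which it should be safe to terminate. It uses the `deregister_instances_from_load_balancer` and `describe_instance_health` calls of boto3's Classic Load Balancer (`elb`) client; the function names, the polling parameters, and the assumption that a fully deregistered instance is reported as `OutOfService` are mine, not something the original poster confirmed.

```python
import time


def worst_case_detection_seconds(response_timeout, unhealthy_threshold, instances=1):
    """Time the ELB may spend waiting for health checks to fail
    (timeout x retries x instances), following suggestion 1."""
    return response_timeout * unhealthy_threshold * instances


def deregister_and_wait(elb_client, lb_name, instance_id,
                        poll_seconds=2, max_wait_seconds=300, sleep=time.sleep):
    """Deregister an instance first, then poll until the ELB reports it
    OutOfService. Returns True on success, False if max_wait_seconds passes."""
    instances = [{"InstanceId": instance_id}]
    elb_client.deregister_instances_from_load_balancer(
        LoadBalancerName=lb_name, Instances=instances
    )
    waited = 0
    while waited <= max_wait_seconds:
        states = elb_client.describe_instance_health(
            LoadBalancerName=lb_name, Instances=instances
        )["InstanceStates"]
        # Assumption: the state stays "InService" while draining is in progress.
        if all(s["State"] == "OutOfService" for s in states):
            return True
        sleep(poll_seconds)
        waited += poll_seconds
    return False
```

With the numbers from suggestion 1, `worst_case_detection_seconds(60, 3)` gives 180 seconds (3 minutes) per instance and `worst_case_detection_seconds(60, 3, instances=4)` gives 720 seconds (12 minutes). Deregistering before terminating is intended to avoid that wait, leaving only the time actually needed for in-flight requests to finish.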