Limit TCP connections to target behind AWS Application Load Balancer

I have an application/target behind an AWS ALB and would like to place a hard cap on the number of TCP connections it will receive.

If I understand correctly, an ALB target can be either

  • Healthy -- ALB will route traffic to the target.

or

  • Unhealthy -- ALB will not route traffic to the target. Furthermore, it will drain/deregister/restart the target as soon as it can (I couldn't find this in the docs, but this is the behavior I've observed).

Ideally I would put the target into a third state that says "Don't kill me but don't route traffic to me either" when the connection cap is reached (whereupon I would spawn more targets to meet demand).

There isn't such a third state but is there another way to place a cap on the number of connections?

Chris Evans asked Oct 24 '16


People also ask

Does application load balancer support TCP?

The AWS Classic Load Balancer (CLB) operates at Layer 4 of the OSI model. What this means is that the load balancer routes traffic between clients and backend servers based on IP address and TCP port.

How many connections can a load balancer handle?

Your load balancer uses these IP addresses to establish connections with the targets. Depending on your traffic profile, the load balancer can scale higher and consume up to a maximum of 100 IP addresses distributed across all enabled subnets.

How much traffic can an AWS load balancer handle?

Network Load Balancer currently supports 200 targets per Availability Zone. For example, if you are in two AZs, you can have up to 400 targets registered with Network Load Balancer. If cross-zone load balancing is on, then the maximum targets reduce from 200 per AZ to 200 per load balancer.

How do I restrict traffic on NLB?

You cannot allow traffic from clients to targets through the load balancer using the security groups for the clients in the security groups for the targets. Use the client CIDR blocks in the target security groups instead.


1 Answer

There is one main misconception in the question itself, so I'll address that first.

an ALB target can be [...] Unhealthy -- ALB will not route traffic to the target. Furthermore, it will drain/deregister/restart the target as soon as it can (I couldn't find this in the docs, but this is the behavior I've observed).

That's not really what's going on.

An ALB is a Load Balancer: it will route requests to targets, according to some routing logic that you can configure to a certain extent.

It will also perform health checks, which it uses to determine, from the ALB's perspective, whether the target is healthy or unhealthy.

Here's the misconception: the only thing that the ALB will do when a target is deemed unhealthy is that it will stop sending new requests to it. That's all.

The ALB itself doesn't have the ability to (1) deregister or (2) restart the target. In fact, on its own, it will keep performing health checks and whenever the target becomes healthy again, it will start sending traffic again.
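
To make that concrete: deregistration only ever happens because something other than the ALB calls the ELBv2 API. A minimal boto3 sketch (the target group ARN and instance ID below are made-up placeholders):

```python
# Hedged sketch: deregistration is an explicit API call made by an operator, a script,
# or an Auto Scaling Group -- never something the ALB decides to do on its own.
# The target group ARN and instance ID are placeholders.
import boto3

elbv2 = boto3.client("elbv2")

elbv2.deregister_targets(
    TargetGroupArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:"
                   "targetgroup/my-targets/0123456789abcdef",
    Targets=[{"Id": "i-0123456789abcdef0"}],
)
```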

The behavior you say you observed is also probably not exactly what happened. You said the target was deregistered and restarted. Unless you have something incredibly custom (highly unlikely), the targets weren't restarted; they were replaced. That's a huge difference.

Let's assume that's the behavior that was actually happening.

The reason it's happening is almost certainly that there's an Auto Scaling Group integrated with the ALB (one of the most common designs on AWS). The Auto Scaling Group can integrate its health checks with the ALB (i.e., the ALB's report of target health is used by the ASG). When the ASG determines that an instance is unhealthy (e.g., via that integration with the ALB), it proceeds to replace the instance, so that it maintains a number of healthy instances equal to its DesiredCapacity.
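
If that's your setup, the ASG-side switch that enables this behavior is the health check type. A boto3 sketch, assuming an existing group named "my-asg" (a placeholder):

```python
# Hedged sketch: make an existing Auto Scaling Group use the ALB's target health,
# which is what causes it to replace instances the ALB reports as unhealthy.
# "my-asg" and the grace period are placeholders.
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="my-asg",
    HealthCheckType="ELB",        # consider ALB target health, not just EC2 status checks
    HealthCheckGracePeriod=300,   # seconds after launch before health checks start counting
)
```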


Now, back to the problem: in short, there's no way at the ALB level to put a hard cap on the number of connections a target will receive.

In practical terms, you need to (1) prevent that situation of saturation from happening in the first place, and (2) decide what to do when it happens.

To prevent it from happening, you need to ensure you always have enough instances to handle the current amount of traffic, plus the projected increase between the moment you can detect traffic rising and the moment new instances are launched and in service. For example, you could use a CloudWatch Alarm based on the average number of connections per target and have it trigger Auto Scaling (before going that route, make sure that "number of connections" is really the best scaling metric; see the final note below). Check how fast you can put new instances in service, and how much over-provisioning you need to maintain so that there's enough headroom between detecting the increase in load and having the new instances ready.
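
One way to wire that up, sketched with boto3 and hedged accordingly: a target-tracking scaling policy on the predefined ALBRequestCountPerTarget metric (requests rather than connections, per the final note below). The ASG name, resource label, and target value are placeholders:

```python
# Hedged sketch: scale the group to keep the average request count per target near a
# chosen value, so new instances are launched before the existing ones saturate.
# All names and numbers below are placeholders; tune them to your workload.
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",
    PolicyName="requests-per-target",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ALBRequestCountPerTarget",
            # format: app/<alb-name>/<alb-id>/targetgroup/<tg-name>/<tg-id>
            "ResourceLabel": "app/my-alb/0123456789abcdef/targetgroup/my-targets/fedcba9876543210",
        },
        "TargetValue": 1000.0,   # average requests per target to aim for
    },
)
```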

What to do when it happens? You have mainly two general choices here:

  • your target can accept and process the request in a possibly "degraded" situation (i.e., you're processing more requests than your spec, so they all might get slower, or they may fail due to downstream issues, etc);

  • or you can quickly reject the request (but double check it isn't an ALB health check request! you should keep accepting and processing those) and return an error message to the caller, a.k.a. load shedding (a minimal sketch follows this list).
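
Here's a minimal load-shedding sketch, not the exact approach from this answer, just one way to do it under the assumption that the ALB health check hits a /health path: count in-flight requests, return 503 once a cap is exceeded, but always answer the health check.

```python
# Hedged sketch of load shedding on the target itself. The /health path, port, and cap
# are assumptions/placeholders; the point is to shed real work while never shedding the
# ALB health check, so the instance stays "healthy" from the ALB's and ASG's perspective.
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

MAX_IN_FLIGHT = 100   # placeholder cap; set it to what one instance can actually handle

_in_flight = 0
_lock = threading.Lock()

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        global _in_flight
        if self.path == "/health":
            # Never reject the ALB health check, or the ASG will replace this instance.
            self._respond(200, b"ok")
            return
        with _lock:
            over_cap = _in_flight >= MAX_IN_FLIGHT
            if not over_cap:
                _in_flight += 1
        if over_cap:
            self._respond(503, b"overloaded, try again later")
            return
        try:
            self._respond(200, b"handled the request")   # real work would happen here
        finally:
            with _lock:
                _in_flight -= 1

    def _respond(self, status, body):
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    ThreadingHTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```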

In either case, you should decide whether you want to "wait it out", or start a process to add new instances to handle the additional load. This decision usually comes down to determining how likely the increase in traffic is to be persistent or just a temporary, short spike.

One thing you shouldn't do is mess with health checks for that purpose. If you reject the health check requests from the ALB, it will mark the instance as unhealthy and, if you have an ASG (you should), the ASG will kill the instance (leading to even more load on the remaining instances while this one is replaced). Additionally, a situation of "I'm healthy but saturated" would be indistinguishable to the ALB from "I'm really having issues and need to be replaced".

As a final note: keep in mind that an ALB isn't really dealing with "connections", but rather with "requests" (i.e., it operates at a higher level of abstraction). What this means is that "number of connections" might not be a good metric to scale on, since the ALB can and most likely will multiplex requests from a lot of clients into a smaller number of connections to a target. That is, if the ALB receives TCP connections from 10 different clients, it may only open 5 (or whatever other number) connections to a target and send requests from all 10 clients through only those 5 connections.

Bruno Reis answered Nov 03 '22