Consider a situation in which we have a web application deployed on multiple servers, with client requests landing on a load balancer that routes each request to one of the servers.
Now, if too many requests arrive concurrently, could the load balancer itself fail? Suppose we receive 1 million requests per second; won't that exceed the processing capacity of a single load balancer?
How do we design (at least conceptually) a system that handles situations like this?
An application load balancer is one feature of elastic load balancing; it gives developers a simpler way to route incoming end-user traffic to applications hosted in the public cloud. Beyond distributing load, it also ensures that no single server bears too much demand.
Load balancers are used to increase the capacity (concurrent users) and reliability of applications. They improve overall application performance by offloading from the servers the burden of managing and maintaining application and network sessions, and by performing application-specific tasks.
How does a load balancer work? A load balancer is a reverse proxy. It presents a virtual IP address (VIP) representing the application to the client. The client connects to the VIP, and the load balancer uses its balancing algorithm to decide which application instance, on which server, receives the connection.
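To make the selection step concrete, here is a minimal sketch of one common balancing algorithm, round-robin, where each new connection goes to the next backend in the pool. The class name and the backend addresses are illustrative, not from any particular product:

```python
from itertools import cycle

# Hypothetical backend pool: the private addresses of the
# application servers sitting behind the load balancer's VIP.
BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

class RoundRobinBalancer:
    """Pick backends in rotation, wrapping around at the end of the pool."""

    def __init__(self, backends):
        self._pool = cycle(backends)

    def pick(self):
        # Each call returns the next backend; after the last one,
        # cycle() wraps back to the first.
        return next(self._pool)

balancer = RoundRobinBalancer(BACKENDS)
print([balancer.pick() for _ in range(4)])
# → ['10.0.0.11', '10.0.0.12', '10.0.0.13', '10.0.0.11']
```

Real load balancers offer other algorithms too (least connections, IP hash, weighted variants), but the core idea is the same: a deterministic rule mapping incoming connections onto the pool.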
Putting a load balancer in front of your load balancer will not solve the problem: if one load balancer were to fail under the traffic, so would the one in front of it!
You can achieve what you're looking for with DNS. You can register multiple IP addresses under one domain name, and hence have multiple load balancers.
Say you're making a request to www.example.com. Your browser looks up the record in DNS and receives a list of corresponding IP addresses. The request then goes to the first address on the list; if that one is unavailable, the client tries the next. DNS servers typically rotate the order of the list (round-robin) to spread the load, and managed DNS services can additionally run periodic health checks and drop unresponsive IPs from the answers. That way your requests are split among several load balancers instead of all hitting just one.
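A rough sketch of the client-side behavior described above, assuming the hostname resolves to several A records (the addresses below come from the 203.0.113.0/24 documentation range and are placeholders; `try_connect` stands in for a real TCP connection attempt):

```python
import socket

def resolve_all(hostname, port=443):
    """Return every IPv4 address published for hostname, one per
    load balancer registered in DNS."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET,
                               socket.SOCK_STREAM)
    # Deduplicate while keeping the order DNS returned.
    return list(dict.fromkeys(info[4][0] for info in infos))

def connect_with_failover(addresses, try_connect):
    """Try each address in order and return the first one that
    accepts; this is the 'go to the next on the list' behavior."""
    for addr in addresses:
        if try_connect(addr):
            return addr
    raise ConnectionError("no load balancer reachable")

# Simulated outage: the first balancer is down, the second answers.
reachable = {"203.0.113.20"}
chosen = connect_with_failover(["203.0.113.10", "203.0.113.20"],
                               lambda addr: addr in reachable)
print(chosen)  # → 203.0.113.20
```

Note that how aggressively clients actually fail over to the next address varies by browser and OS resolver, which is why managed DNS health checks (removing dead IPs from the answer itself) are the more reliable layer.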