I am reading about load balancing.
I understand the idea that load balancers distribute the load among several slave servers of a given app. However, very little of the literature I can find discusses what happens when the load balancers themselves start struggling with the huge number of requests, to the point that the "simple" task of load balancing (distributing requests among slaves) becomes an impossible undertaking.
Take for example this picture, where you see 3 load balancers (LB) and some slave servers.
Figure 1: Clients know one IP to which they connect; one load balancer sits behind that IP and has to handle all those requests, so that first load balancer is the bottleneck (as is the internet connection).
What happens when the first load balancer starts struggling? If I add a new load balancer alongside the first one, I must add yet another one in front of both so that the clients only need to know one IP. So the dilemma continues: I still have only one load balancer receiving all my requests...!
Figure 2: I added one load balancer, but to keep clients knowing just one IP I had to add another one in front to centralize the incoming connections, thus ending up with the same bottleneck.
Moreover, my internet connection will also reach the limit of clients it can handle, so I will probably want to place my load balancers in remote locations to avoid flooding my own connection. However, if I distribute my load balancers and want to keep my clients knowing just one single IP to connect to, I still need one central load balancer behind that IP carrying all the traffic once again...
How do real-world companies like Google and Facebook handle these issues? Can this be done without giving the clients multiple IPs and expecting them to choose one at random, so that not every client connects to the same load balancer and floods it?
It is possible for load balancers to be overwhelmed, but typically that requires a load heavy enough to saturate most network connections first.
Bottlenecks. As scale increases, load balancers can themselves become a bottleneck or single point of failure, so multiple load balancers must be used to guarantee availability. DNS round robin can be used to balance traffic across different load balancers.
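As a minimal client-side illustration of what DNS round robin looks like in practice, here is a sketch in Python; the hostname service.example.com is a placeholder, not a real service, so resolution will only succeed against a name that actually publishes several A records:

```python
import random
import socket

# With DNS round robin, the authoritative server publishes one A record
# per load balancer under a single name ("service.example.com" is a
# placeholder; substitute a real hostname).
infos = socket.getaddrinfo("service.example.com", 443, proto=socket.IPPROTO_TCP)
addresses = sorted({info[4][0] for info in infos})

# A naive client simply picks one of the returned addresses; most stub
# resolvers already rotate the record order, which is what spreads
# clients across the different load balancers.
chosen = random.choice(addresses)
print(f"connecting to load balancer at {chosen}")
```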
The main purpose of load balancing is to prevent any single server from getting overloaded and possibly breaking down. In other words, load balancing improves service availability and helps prevent downtime.
Failover and load balancing are vital for Oracle Access Manager availability and performance. Load balancing distributes request processing across multiple servers. Failover redirects requests to alternate servers if the originally requested server is unavailable or too slow.
If a server or group of servers is performing slowly, the load balancer sends less traffic to it. If a server or group of servers fails completely, the load balancer reroutes traffic to another group of servers, a process known as "failover."
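To make the failover idea concrete, here is a minimal sketch assuming a hypothetical primary group and backup group of backends (the IPs and port are made up), using a plain TCP connect as a crude health check:

```python
import socket

# Hypothetical pools: a primary group of servers and a backup group.
PRIMARY = [("10.0.0.11", 8080), ("10.0.0.12", 8080)]
BACKUP = [("10.1.0.11", 8080), ("10.1.0.12", 8080)]

def is_healthy(addr, timeout=1.0):
    """A TCP connect acting as a crude health check."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False

def pick_backend():
    """Prefer healthy primary servers; fail over to the backup group."""
    for group in (PRIMARY, BACKUP):
        healthy = [addr for addr in group if is_healthy(addr)]
        if healthy:
            return healthy[0]
    raise RuntimeError("no healthy backend in any group")
```

Real load balancers run such checks continuously in the background rather than per request, but the routing decision is the same.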
A server load balancer is based on a TCP/IP or DNS approach and distributes the traffic of high-volume sites across several servers using network-based hardware or software-defined appliances. When this technique works across several geo-locations, it's called a global server load balancer.
On the Internet, load balancing is often employed to divide network traffic among several servers. This reduces the strain on each server and makes the servers more efficient, speeding up performance and reducing latency. Load balancing is essential for most Internet applications to function properly.
Software load balancers with cloud offload provide efficient and cost-effective protection. There are a variety of load-balancing methods, each using an algorithm best suited to a particular situation. The Least Connection Method, for example, directs traffic to the server with the fewest active connections.
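The least-connection rule itself is simple enough to sketch in a few lines; the backend names below are placeholders:

```python
# Tracks active connections per backend and always picks the least
# loaded one.
class LeastConnectionBalancer:
    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def acquire(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        self.active[backend] -= 1

lb = LeastConnectionBalancer(["app1:8080", "app2:8080", "app3:8080"])
b = lb.acquire()   # the backend with the fewest active connections
# ... proxy the request to b ...
lb.release(b)
```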
Your question doesn't sound AWS-specific, so here's a generic answer (an elastic LB in AWS auto-scales depending on traffic):
You're right, you can overwhelm a load balancer with the number of incoming requests. If you deploy an LB on a standard build machine, you're likely to first exhaust or overload the network stack, including the maximum number of open connections and the rate at which incoming connections can be handled.
As a first step, you would fine-tune the network stack of your LB machine. If that still does not provide the required throughput, there are special-purpose load-balancer appliances on the market that are built from the ground up and highly optimized to handle a large number of incoming connections and route them to several servers. Examples of these are F5 and NetScaler.
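One concrete example of such tuning: every open connection consumes a file descriptor, so the per-process descriptor limit caps how many concurrent connections the LB process can hold. A minimal sketch of inspecting and raising that limit from Python (Unix only; raising the hard limit itself requires root and matching kernel settings such as fs.file-max on Linux):

```python
import resource

# One socket = one file descriptor, so RLIMIT_NOFILE bounds the number
# of concurrent connections this process can keep open.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"current fd limit: soft={soft}, hard={hard}")

# Raise the soft limit up to the hard limit; going beyond the hard
# limit needs root privileges and kernel-level configuration.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```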
You can also design your application in ways that help you split traffic across different subdomains, thereby reducing the number of requests a single LB has to handle.
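For instance, a classic form of this is sharding asset URLs across several subdomains, each of which can resolve in DNS to its own LB. A sketch, with hypothetical shard names:

```python
import zlib

# Hypothetical shard subdomains, each resolving (in DNS) to its own LB.
SHARDS = ["img1.example.com", "img2.example.com",
          "img3.example.com", "img4.example.com"]

def shard_url(path):
    """Map an asset path deterministically onto one shard subdomain,
    so the same asset always comes from the same LB (cache-friendly)."""
    shard = SHARDS[zlib.crc32(path.encode()) % len(SHARDS)]
    return f"https://{shard}{path}"

print(shard_url("/static/logo.png"))
```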
It is also possible to implement round-robin DNS, where one DNS entry points to several client-facing LBs instead of just one, as you've depicted.
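Conceptually, the authoritative server just rotates the order of the A records it returns on each query. A toy sketch of that rotation, using documentation-range IPs:

```python
from itertools import cycle

# IPs of the client-facing LBs, all published as A records for one name.
LB_IPS = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]
rotation = cycle(range(len(LB_IPS)))

def answer_query():
    """Each response rotates the record order, so successive clients
    favor different LBs -- the essence of round-robin DNS."""
    start = next(rotation)
    return LB_IPS[start:] + LB_IPS[:start]

print(answer_query())  # ['203.0.113.10', '203.0.113.11', '203.0.113.12']
print(answer_query())  # ['203.0.113.11', '203.0.113.12', '203.0.113.10']
```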
Advanced load balancers like NetScaler and similar products also do GSLB (global server load balancing) with DNS, not simple DNS round robin. To explain further scaling:
If clients are to connect to, e.g., service.domain.com, you let the load balancers become the authoritative DNS servers for the zone, and you add all the load balancers as valid name servers.
When a client looks up "service.domain.com", any of your load balancers will answer the DNS request and reply with the IP of the correct data center for that client. You can further make the load balancer answer the DNS request based on the geo-location of the client, the latency between the client's DNS server and the NetScaler, or the load of the different data centers.
In each data center you typically set up one node, or several nodes in a cluster. You can scale quite high using such a design.
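As a toy illustration of the kind of decision such a GSLB makes when answering a DNS query, here is a sketch with made-up datacenter data; real appliances offer configurable policies (static proximity, dynamic RTT, least load) rather than this single ad-hoc score:

```python
# Hypothetical per-datacenter state a GSLB-capable LB might consult:
# measured latency toward the client's resolver and current load.
DATACENTERS = {
    "eu":   {"ip": "198.51.100.10", "latency_ms": 35,  "load": 0.60},
    "us":   {"ip": "198.51.100.20", "latency_ms": 120, "load": 0.30},
    "apac": {"ip": "198.51.100.30", "latency_ms": 210, "load": 0.45},
}

def pick_datacenter():
    """Toy policy: weigh proximity against load and return the IP of
    the best datacenter for this client's DNS query."""
    def score(dc):
        return dc["latency_ms"] * (1 + dc["load"])
    best = min(DATACENTERS.values(), key=score)
    return best["ip"]

print(pick_datacenter())  # the IP this client's DNS query would receive
```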