
Apache Http Client and Load Balancers

After spending a few hours reading the Http Client documentation and source code I have decided that I should definitely ask for help here.

I have a load balancer server using a round-robin algorithm somewhat like this

                               +---> RESTServer1
    client --> load balancer --+---> RESTServer2
                               +---> RESTServer3

Our client is using HttpClient to direct requests to our load balancer server, which in turn round-robins the requests to the corresponding RESTServer.

Now, Apache HttpClient creates, by default, a pool of connections (2 per route by default). These connections are persistent by default, since I am using HTTP/1.1 and my servers are emitting Connection: Keep-Alive headers.

So the problem is that, since HttpClient creates these persistent connections, those connections are no longer subject to the round-robin algorithm at the balancer level. They hit the same server every time.

This creates two problems:

  1. I can see that sometimes one or more of the balanced servers are overloaded with traffic, whereas one or more of the other servers are idle; and
  2. even if I take one of my REST servers out of the balancer, it still receives requests while the persistent connections are alive.

Definitely this is not the intended behavior.

I suppose I could force a Connection: close header in my responses, or I could run HttpClient without a connection pool or with a NoConnectionReuseStrategy. But the documentation for HttpClient states that the idea behind the pool is to improve performance by avoiding having to open a socket every time and repeat the TCP handshake and related setup. So I have to conclude that the connection pool is beneficial to the performance of my applications.

So my question is: is there a way to use persistent connections with a load balancer in the path, or am I forced to use non-persistent connections for this scenario?

I want the performance that comes with reusing connections, but I want them properly load-balanced. Any thoughts on how I can configure this scenario with Apache Http Client if at all possible?

Edwin Dalorzo asked Mar 21 '15 16:03




2 Answers

Your question is perhaps more related to your load balancer configuration and the style of load balancing. There are several ways:

  1. HTTP Redirection
  2. LB acts as a reverse proxy
  3. Pure packet forwarding

In scenarios 1 and 3 you have no chance of balancing individual requests over persistent connections. If your load balancer acts as a reverse proxy, there might be a way to combine persistent connections with balancing. "Dumb" balancers, like those used for SMTP or LDAP, select the target per TCP connection, not per request.

For example the Apache HTTPd server with the balancer module (see http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html) can dispatch every request (even on persistent connections) to a different server.
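To make the difference between choosing a target per TCP connection and per request concrete, here is a small toy model in plain Java. It is only a sketch: it does not use HttpClient or any real balancer, and the class and method names (`BalancingModes`, `perConnection`, `perRequest`) are made up for illustration. It assumes a client pool of 2 connections, as in the question, and three backends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (not HttpClient code): round-robin balancing per connection vs. per request. */
final class BalancingModes {

    /** Hands out backends in strict rotation. */
    static final class RoundRobin {
        private final List<String> backends;
        private int next = 0;

        RoundRobin(List<String> backends) {
            this.backends = backends;
        }

        String pick() {
            String chosen = backends.get(next);
            next = (next + 1) % backends.size();
            return chosen;
        }
    }

    /** Backend is chosen once per connection; every request on that connection follows it. */
    static List<String> perConnection(List<String> backends, int connections, int requests) {
        RoundRobin rr = new RoundRobin(backends);
        String[] pinned = new String[connections];
        for (int c = 0; c < connections; c++) {
            pinned[c] = rr.pick(); // decided at connect time, never revisited
        }
        List<String> hits = new ArrayList<>();
        for (int r = 0; r < requests; r++) {
            hits.add(pinned[r % connections]); // client alternates over its pooled connections
        }
        return hits;
    }

    /** Backend is chosen for every request, regardless of which connection carried it. */
    static List<String> perRequest(List<String> backends, int requests) {
        RoundRobin rr = new RoundRobin(backends);
        List<String> hits = new ArrayList<>();
        for (int r = 0; r < requests; r++) {
            hits.add(rr.pick());
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("RESTServer1", "RESTServer2", "RESTServer3");
        System.out.println("per connection: " + perConnection(servers, 2, 6));
        System.out.println("per request:    " + perRequest(servers, 6));
    }
}
```

In the per-connection case the two pooled connections stay pinned to the first two servers and the third one never sees a request, which is the symptom described in the question. In the per-request case the same six requests are spread evenly.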

Also check that you are not receiving a balancer cookie that makes sessions sticky; in that case the cause is not the persistent connection but the balancer cookie.

HTH, Mark

mp911de answered Sep 20 '22 00:09


+1 to @mp911de's answer

One can also make scenarios 1 and 3 work reasonably well by limiting the total time to live of persistent connections to some short period of time, say 15 seconds. This way connections live long enough to get re-used during periods of activity and short enough to go away during periods of relative inactivity.
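As far as I know, in HttpClient 4.4 and later this limit can be set with `HttpClientBuilder.setConnectionTimeToLive(15, TimeUnit.SECONDS)`; check the javadoc for the version you are on. The sketch below does not call the library. It is a toy model in plain Java, with made-up names (`TtlRebalancing`, `simulate`), of what a 15-second cap does when the balancer picks a backend round-robin for each new connection and the client sends one request every 5 seconds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (not HttpClient code): one persistent connection with a capped total lifetime. */
final class TtlRebalancing {

    /**
     * Returns the backend that serves each request. The balancer picks a backend
     * round-robin whenever a new connection is opened; the client reopens the
     * connection once it has been open for ttlSeconds or longer.
     */
    static List<String> simulate(List<String> backends, long ttlSeconds, long[] requestTimes) {
        List<String> hits = new ArrayList<>();
        int nextBackend = 0;
        String current = null; // backend behind the open connection, null if none yet
        long openedAt = 0;
        for (long now : requestTimes) {
            boolean expired = current != null && now - openedAt >= ttlSeconds;
            if (current == null || expired) {
                current = backends.get(nextBackend); // new connection, new balancing decision
                nextBackend = (nextBackend + 1) % backends.size();
                openedAt = now;
            }
            hits.add(current);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("RESTServer1", "RESTServer2", "RESTServer3");
        long[] everyFiveSeconds = {0, 5, 10, 15, 20, 25, 30, 35, 40};
        System.out.println(simulate(servers, 15, everyFiveSeconds));
    }
}
```

With the cap in place each connection is re-used for three requests and then replaced, so the balancer gets a fresh decision every 15 seconds. Without a cap the same request sequence would stay on the first server for as long as the connection lives.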

ok2c answered Sep 17 '22 00:09