
Looking for a message bus for HTTP load sharing

I have an HTTP application with standalone workers that perform well. The issue is that they sometimes need to purge and rebuild their caches, and while doing so they stop responding for up to 30 seconds.

I have looked into a number of load balancers, but none of them seem to address this issue. I have tried Perlbal and some Apache modules (like fcgid) and they happily send requests to workers that are busy rebuilding their cache.

So my take is this: isn't there some kind of message bus solution where all HTTP requests are queued up, leaving it to the workers to process messages when they are able to?

Or, alternatively, a load balancer that can take into account that the workers are sometimes unable to respond.

Added later: I am aware that one strategy would be for the workers to use a management protocol to inform the load balancer when they are busy, but that solution seems kludgy and I worry that there will be edge cases that result in spurious errors.

mzedeler asked Mar 23 '13 13:03


2 Answers

If you use an Amazon Web Services Elastic Load Balancer (ELB), you can achieve the result you want: an EC2 instance behind the ELB can be marked as unhealthy while it purges and rebuilds its cache.

What I would do is create an additional endpoint on each instance, called rebuild_cache for example. If you have 5 instances behind your ELB, you can write a script that hits each individual instance (not through the load balancer) on that rebuild_cache endpoint. This endpoint would do 3 things:

  1. Mark the instance as unhealthy. The load balancer will only notice that it is unhealthy after a failed health check (the timing and threshold of health checks are configurable from the AWS Web Console).
  2. Run your cache purge and rebuild.
  3. Mark the instance as healthy. The load balancer will run health checks on the instance and only start sending it traffic again once it has passed the required number of health checks (again, this threshold is defined through the ELB health check configuration).
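
The three steps above could be sketched roughly as follows. This is only an illustration using Python's standard library, not a definitive implementation: the `/health` path, the `WorkerState` class, the `run_rebuild` helper, and the `drain_seconds` delay are all hypothetical names chosen for this example. The delay is an assumption that the worker should wait until the load balancer has had time to see a failed health check before it actually stops responding.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class WorkerState:
    """Thread-safe flag saying whether this worker wants traffic."""

    def __init__(self):
        self._lock = threading.Lock()
        self._healthy = True

    def is_healthy(self):
        with self._lock:
            return self._healthy

    def set_healthy(self, value):
        with self._lock:
            self._healthy = value


def rebuild_cache():
    """Placeholder for the real purge-and-rebuild work (hypothetical)."""
    time.sleep(0.1)


def health_status(state):
    """HTTP status the health check endpoint should answer with."""
    return 200 if state.is_healthy() else 503


def run_rebuild(state, rebuild=rebuild_cache, drain_seconds=30.0):
    """Steps 1-3: mark unhealthy, rebuild, mark healthy again."""
    state.set_healthy(False)          # step 1: start failing health checks
    try:
        # Assumption: wait long enough for the balancer to register the
        # failed checks and stop routing requests to this worker.
        time.sleep(drain_seconds)
        rebuild()                     # step 2: purge and rebuild the cache
    finally:
        state.set_healthy(True)       # step 3: pass health checks again


STATE = WorkerState()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            code = health_status(STATE)
        elif self.path == "/rebuild_cache":
            # Rebuild in the background so this request returns at once.
            threading.Thread(
                target=run_rebuild, args=(STATE,), daemon=True
            ).start()
            code = 202
        else:
            code = 200                # normal application traffic
        self.send_response(code)
        self.end_headers()


def serve(port=8080):
    """Start the worker; call this from your own entry point."""
    HTTPServer(("", port), Handler).serve_forever()
```

With something like this in place, the load balancer's health check would point at `/health`, and the script that triggers rebuilds would request `/rebuild_cache` on each instance directly. The `drain_seconds` value would need to be longer than the health check interval multiplied by the unhealthy threshold configured on the balancer.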
Drewch answered Jan 03 '23 13:01


I see two strategies here. The first is to take a worker offline for the duration of the rebuild, so that the balancer stops sending it requests. The second is inversion of control: workers pull tasks from the balancer instead of the balancer pushing tasks to workers. The second strategy is easy to implement with a message queue.
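
As a rough illustration of the pull model, here is a minimal in-process sketch. It uses Python's `queue.Queue` as a stand-in for a real message queue, and the names `worker`, `run_demo`, and `rebuild_every` are hypothetical. The point it tries to show is that a worker that is busy rebuilding its cache simply does not pull, so pending requests wait in the queue or go to another worker instead of failing.

```python
import queue
import threading
import time


def rebuild_cache():
    """Placeholder for the real purge-and-rebuild work (hypothetical)."""
    time.sleep(0.01)


def worker(name, tasks, results, rebuild_every=3):
    """Pull requests when ready; rebuild the cache between pulls."""
    handled = 0
    while True:
        request = tasks.get()         # blocks until a request is queued
        if request is None:           # sentinel: no more work
            tasks.task_done()
            break
        results.put((name, request))  # "handle" the request
        handled += 1
        tasks.task_done()
        if handled % rebuild_every == 0:
            # While rebuilding, this worker is not pulling, so queued
            # requests wait or are picked up by other workers.
            rebuild_cache()


def run_demo(num_requests=10, num_workers=2):
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(f"w{i}", tasks, results))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for request_id in range(num_requests):
        tasks.put(request_id)
    for _ in threads:
        tasks.put(None)               # one shutdown sentinel per worker
    for t in threads:
        t.join()
    handled = []
    while not results.empty():
        handled.append(results.get())
    return handled
```

Calling `run_demo()` returns a list of `(worker_name, request_id)` pairs; which worker handles which request depends on timing, but every request is handled exactly once. In a real deployment, a front end would have to accept the HTTP request, enqueue it, and hold the connection open until a worker posts the response, which is the part this sketch leaves out.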

kan answered Jan 03 '23 11:01