I have an HTTP application with standalone workers that perform well. The issue is that sometimes they need to purge and rebuild their caches, so they stop responding for up to 30 seconds.
I have looked into a number of load balancers, but none of them seem to address this issue. I have tried Perlbal and some Apache modules (like fcgid) and they happily send requests to workers that are busy rebuilding their cache.
So my take is this: isn't there some kind of message bus solution where all HTTP requests are queued up, leaving it up to the workers to process messages when they are able to?
Or - alternatively - a load balancer that can take into account that the workers are sometimes unable to respond.
Added later: I am aware that one strategy would be for the workers to use a management protocol to inform the load balancer when they are busy, but that solution seems kludgy and I worry that there will be some edge cases that result in spurious errors.
If you use an Amazon Web Services load balancer, you can achieve your desired result. You can mark an EC2 instance behind an Elastic Load Balancer (ELB) as unhealthy while it does this cache purge and rebuild.
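One way a worker could report itself as unhealthy is through a health check URL that the load balancer polls. The following is only a minimal sketch using the Python standard library: the /health path, the port, and the names rebuilding, health_status, and rebuild_cache are assumptions made for illustration and do not come from the original post or from any AWS API.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical flag: set while this worker is rebuilding its cache.
rebuilding = threading.Event()


def health_status() -> int:
    """Return the HTTP status the health check should answer with."""
    return 503 if rebuilding.is_set() else 200


def rebuild_cache(rebuild) -> None:
    """Run the rebuild callback while reporting the worker as unhealthy."""
    rebuilding.set()
    try:
        rebuild()
    finally:
        # Report healthy again even if the rebuild raised an exception.
        rebuilding.clear()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the assumed /health path is served; anything else is a 404.
        status = health_status() if self.path == "/health" else 404
        self.send_response(status)
        self.end_headers()


if __name__ == "__main__":
    # Assumed address and port; point the load balancer's health check here.
    ThreadingHTTPServer(("127.0.0.1", 8080), HealthHandler).serve_forever()
```

A load balancer polls health checks on an interval, so there is a delay between the worker answering 503 and the balancer actually taking it out of rotation.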
What I would do is create an additional endpoint for each instance, called rebuild_cache for example. So if you have 5 instances behind your ELB, you can write a script that hits each individual instance (not through the load balancer) on that rebuild_cache endpoint. This endpoint would do 3 things:
I see two strategies here: take a worker offline for the period, so the balancer skips it; or invert control, so that workers pull tasks from the balancer instead of the balancer pushing tasks to workers. The second strategy is easy to implement with a message queue.
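The pull model can be illustrated with an in-process queue. This is a sketch under stated assumptions, not a production setup: queue.Queue stands in for a real message broker, the worker names and the cache_ready event are hypothetical, and a real front end would also have to hold each HTTP connection open until a worker replies.

```python
import queue
import threading


def worker(name, requests, results, cache_ready):
    """Pull requests from the shared queue, but only once the cache is ready."""
    cache_ready.wait()  # a rebuilding worker simply does not pull
    while True:
        item = requests.get()
        if item is None:  # sentinel: shut this worker down
            break
        results.put((name, item))


def run_demo():
    requests, results = queue.Queue(), queue.Queue()
    busy_ready = threading.Event()  # left unset: this worker is "rebuilding"
    free_ready = threading.Event()
    free_ready.set()

    workers = [
        threading.Thread(target=worker, args=("busy", requests, results, busy_ready)),
        threading.Thread(target=worker, args=("free", requests, results, free_ready)),
    ]
    for w in workers:
        w.start()

    for i in range(5):
        requests.put(i)
    # All five requests are answered while the busy worker is still blocked.
    handled = [results.get(timeout=5) for _ in range(5)]

    busy_ready.set()  # the rebuild finishes; the busy worker starts pulling
    for _ in workers:
        requests.put(None)  # one sentinel per worker
    for w in workers:
        w.join()
    return handled


if __name__ == "__main__":
    print(run_demo())
```

Because the rebuilding worker never takes anything from the queue, every request in this demo is handled by the worker named free, and nothing is routed to a worker that cannot respond.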