I am trying to set up a Google Cloud internal TCP load balancer. The instance group behind this LB consists of redis-server processes listening on port 6379. Only one of these Redis instances is the master.
Problem: Add a TCP health check that detects the Redis master, so that the LB sends all traffic to the master only.
Approach:
Added a TCP Health Check for the port 6379.
In order to send the command role to the redis-server process and parse the response, I am using the optional request/response params provided in the health check. Please check the screenshot here.
Result: The health check fails for all instances. If I remove the optional request/response params, the health check passes for all of them.
Debugging:
When the role command is sent to redis-server directly, it sends the response starting with *3 (for master) and *5 (for slave) as expected.
Ran nc -l -p 6379 to check what exactly is being received on the instance's side during the health check. It does receive role\r\n.
Ran the MONITOR command in redis-cli to watch the log of commands received by this process. There is no log of role.
This means the instance is receiving the data (role\r\n) over TCP, but it is not reaching the Redis process (as per the MONITOR command), or something else is happening. Please help.

Unfortunately, GCP's TCP health checks are pretty limited in what can be checked in the response. From https://cloud.google.com/sdk/gcloud/reference/compute/health-checks/create/tcp:
--response=RESPONSE
An optional string of up to 1024 characters that the health checker expects to receive from the instance. If the response is not received exactly, the health check probe fails. If --response is configured, but not --request, the health checker will wait for a response anyway. Unless your system automatically sends out a message in response to a successful handshake, only configure --response to match an explicit --request.
Note the word "exactly" in the help message. The response has to match the provided string in full. One can't specify a partial string to search for in the response.
As you can see at https://redis.io/commands/role, Redis's ROLE command returns a multi-element reply. Though the substring "master" is present in the response, the rest of the reply varies from setup to setup (based on the number of slaves, their addresses, their replication offsets, etc.).
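To make the variability concrete, here is a rough sketch that builds the wire-format reply a master would send for ROLE under two different setups. The helper names (encode_resp, master_role_reply, looks_like_master) and the replica address are made up for illustration; the reply shape (role name, replication offset, list of replicas) follows the redis.io page linked above. If the health checker really compares the whole response, no single fixed string could match both replies, even though a prefix check could.

```python
def encode_resp(value):
    """Encode an int, str, or (nested) list into RESP2 wire format."""
    if isinstance(value, int):
        return b":%d\r\n" % value
    if isinstance(value, str):
        data = value.encode()
        return b"$%d\r\n%s\r\n" % (len(data), data)
    if isinstance(value, list):
        return b"*%d\r\n" % len(value) + b"".join(encode_resp(v) for v in value)
    raise TypeError("unsupported type: %r" % type(value))


def master_role_reply(offset, replicas):
    """Build the ROLE reply of a master.

    replicas is a list of (ip, port, offset) tuples; Redis reports each
    replica field as a string.
    """
    replica_list = [[ip, str(port), str(off)] for ip, port, off in replicas]
    return encode_resp(["master", offset, replica_list])


def looks_like_master(reply):
    """Prefix check: the part of the reply that does not vary between setups."""
    return reply.startswith(b"*3\r\n$6\r\nmaster\r\n")


# Hypothetical setups: a master with no replicas, and one with a single replica.
no_replicas = master_role_reply(100, [])
one_replica = master_role_reply(200, [("10.0.0.2", 6379, 200)])

print(no_replicas)
print(one_replica)
print(no_replicas == one_replica)   # whole-string comparison differs
print(looks_like_master(no_replicas), looks_like_master(one_replica))
```

Only the first 16 bytes are stable across setups; everything after them depends on the replication state at the moment of the probe.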
You should definitely raise a feature request with GCP for regex matching on the response. A possible workaround until then is to run a little web app on each host that executes "redis-cli role | grep master" locally and returns the result. The health check can then be configured to monitor this web app instead.
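A minimal sketch of such a web app, using only the Python standard library, might look like the following. Instead of shelling out to redis-cli, it sends ROLE over a local TCP connection and checks the start of the reply. The names REDIS_ADDR, fetch_role_reply, status_for, HealthHandler and run, as well as port 8080, are assumptions for illustration, and the sketch assumes Redis needs no authentication. It answers 200 on the master and 503 otherwise, which an HTTP health check can consume.

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: Redis listens locally on the default port, without AUTH.
REDIS_ADDR = ("127.0.0.1", 6379)
# Stable prefix of a master's ROLE reply: 3-element array, bulk string "master".
MASTER_PREFIX = b"*3\r\n$6\r\nmaster\r\n"


def fetch_role_reply(addr=REDIS_ADDR, timeout=2.0):
    """Send ROLE as an inline command and return the first bytes of the reply."""
    with socket.create_connection(addr, timeout=timeout) as conn:
        conn.sendall(b"ROLE\r\n")
        data = b""
        # Read only as much as is needed to decide master vs. not master.
        while len(data) < len(MASTER_PREFIX):
            chunk = conn.recv(64)
            if not chunk:
                break
            data += chunk
        return data


def status_for(reply):
    """Map a raw ROLE reply to an HTTP status code."""
    return 200 if reply.startswith(MASTER_PREFIX) else 503


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            code = status_for(fetch_role_reply())
        except OSError:
            # Redis unreachable or timed out: report unhealthy.
            code = 503
        body = b"master\n" if code == 200 else b"not master\n"
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8080):
    """Serve the health endpoint until interrupted (port is an assumption)."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Calling run() on each instance and pointing an HTTP health check at that port would mark only the master as healthy. A failover would be picked up on the next probe after the new master starts answering 200.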