Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Health check to detect redis master from google tcp load balancer

I am trying to setup a google TCP internal Load Balancer. Instance group behind this lb consists of redis-server processes listening on port 6379. Out of these redis instances, only one of them is master.

Problem: Add a TCP health check to detect the redis master and make lb divert all traffic to redis master only.

Approach: Added a TCP Health Check for the port 6379. In order to send the command role to redis-server process and parse the response, I am using the optional params provided in the health check. Please check the screenshot here.

Result: Health check is failing for all. If I remove the optional request/response params, health check starts passing for all.

Debugging:

  1. Connected to lb using netcat and issued the command role, it sends the response starting with *3(for master) and *5(for slave) as expected.
  2. Logged into instance and stopped redis-server process. Started listening on port 6379 using nc -l -p 6379 to check what exactly is being received at the instance's side in the health check. It does receive role\r\n.
  3. After step 2, restarted redis-server and ran MONITOR command in redis-cli, to watch log of commands received by this process. Here there is no log of role. This means, instance is receiving the data(role\r\n) over tcp but is not received by the process redis-cli(as per MONITOR command) or something else is happening. Please help.
like image 688
Yadvendar Avatar asked Mar 10 '23 10:03

Yadvendar


1 Answers

Unfortunately GCP's TCP health checks is pretty limited on what can be checked in the response. From https://cloud.google.com/sdk/gcloud/reference/compute/health-checks/create/tcp:

--response=RESPONSE
 An optional string of up to 1024 characters that the health checker expects to receive from the instance. If the response is not received exactly, the health check probe fails. If --response is configured, but not --request, the health checker will wait for a response anyway. Unless your system automatically sends out a message in response to a successful handshake, only configure --response to match an explicit --request.

Note the word "exactly" in the help message. The response has to match the provided string in full. One can't specify a partial string to search for in the response.

As you can see on https://redis.io/commands/role, redis's ROLE command returns a bunch of text. Though the substring "master" is present in the response, it also has a bunch of other text that would vary from setup to setup (based on the number of slaves, their addresses, etc.).

You should definitely raise a feature request with GCP for regex matching on the response. A possible workaround until then is to have a little web app on each host that performs "redis-cli role | grep master" command locally and return the response. Then the health check can be configured to monitor this web app.

like image 148
Praveen Yalagandula Avatar answered Apr 06 '23 12:04

Praveen Yalagandula