I had the following Sentinel deployment: 3 Redis instances on 3 different servers, with one Sentinel on each of these servers (3 Sentinels in total).
Then I realized that the current master does not have much memory, so I stopped the Sentinel and the Redis instance on that particular server and did the same setup on a new machine. So I still have the same deployment: 3 Redis instances and 3 Sentinels.
The issue is that the Sentinels now report the master as down, because they still think the master is the server I removed. What should I do to tell the Sentinels that they need not include that server any more?
The reason for that is that Sentinel is used to monitor that the Redis master and replicas work as expected; if the master goes offline, it performs an automatic failover.
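For context, each Sentinel learns about the master from a monitor directive in its configuration file. A minimal sketch (the master name mymaster, the address, and the quorum of 2 are example values, not taken from the question):

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000

The quorum (the trailing 2) is the number of Sentinels that must agree the master is unreachable before a failover can be started.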
The +sdown event happened 5 seconds after Redis went away; this is due to the sentinel down-after-milliseconds mymaster 5000 setting. Sentinel detected that Redis went away, but it didn't actually do anything in response yet: +sdown only means that this single Sentinel subjectively considers the master down; a failover is only started once enough Sentinels agree to meet the quorum and the master is marked +odown.
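In the Sentinel log this shows up as an event line roughly like the following (the address matches the example configuration above):

+sdown master mymaster 127.0.0.1 6379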
Once connected to the Sentinel, you can ask it for the current master:

127.0.0.1:5000> sentinel get-master-addr-by-name mymaster
1) "127.0.0.1"
2) "6379"

The Sentinel process connects to the Redis process to detect whether it's still available. You can see this from the Redis process by listing its clients:
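Roughly like this (illustrative output; the exact fields and the client name vary by Redis version, but the Sentinel connection shows up among the clients):

127.0.0.1:6379> client list
id=7 addr=127.0.0.1:52962 fd=12 name=sentinel-7f5074f5-cmd age=30 idle=0 flags=N db=0 cmd=ping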
The main thing to note is that the config file points to the address of the Redis master, 127.0.0.1:6379. Now start the first Sentinel from the first config file:
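Assuming the file is called sentinel1.conf (the name here is just an example), that is:

$ redis-server sentinel1.conf --sentinel

Equivalently, redis-sentinel sentinel1.conf starts the same process in Sentinel mode.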
From the docs about Redis Sentinel, under the chapter Adding or removing Sentinels:
Removing a Sentinel is a bit more complex: Sentinels never forget already seen Sentinels, even if they are not reachable for a long time, since we don't want to dynamically change the majority needed to authorize a failover and the creation of a new configuration number. So in order to remove a Sentinel the following steps should be performed in absence of network partitions:
- Stop the Sentinel process of the Sentinel you want to remove.
- Send a SENTINEL RESET * command to all the other Sentinel instances (instead of * you can use the exact master name if you want to reset just a single master), one after the other, waiting at least 30 seconds between instances.
- Check that all the Sentinels agree about the number of Sentinels currently active, by inspecting the output of SENTINEL MASTER mastername of every Sentinel.
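For the deployment in the question, that boils down to something like the following against the three remaining Sentinels (the hostnames and the port 26379 are assumptions; use whatever your Sentinels actually listen on):

$ redis-cli -h server1 -p 26379 SENTINEL RESET '*'
(wait at least 30 seconds)
$ redis-cli -h server2 -p 26379 SENTINEL RESET '*'
(wait at least 30 seconds)
$ redis-cli -h server3 -p 26379 SENTINEL RESET '*'

Afterwards, run SENTINEL MASTER mymaster against each Sentinel and check that the num-other-sentinels field reports 2 on all of them. The * is quoted so the shell does not expand it.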
Further:
Removing the old master or unreachable slaves.
Sentinels never forget about slaves of a given master, even when they are unreachable for a long time. This is useful, because Sentinels should be able to correctly reconfigure a returning slave after a network partition or a failure event.
Moreover, after a failover, the failed-over master is virtually added as a slave of the new master; this way it will be reconfigured to replicate from the new master as soon as it becomes available again.
However sometimes you want to remove a slave (that may be the old master) forever from the list of slaves monitored by Sentinels.
In order to do this, you need to send a SENTINEL RESET mastername command to all the Sentinels: they'll refresh the list of slaves within the next 10 seconds, only adding the ones listed as correctly replicating from the current master INFO output.
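Applied to the question, that would be something like this (again assuming port 26379; mymaster is whatever name you used in the monitor directive):

$ redis-cli -h server1 -p 26379 SENTINEL RESET mymaster
$ redis-cli -h server2 -p 26379 SENTINEL RESET mymaster
$ redis-cli -h server3 -p 26379 SENTINEL RESET mymaster

About 10 seconds later, SENTINEL SLAVES mymaster on each Sentinel should no longer list the removed server (on Redis 5 and later the same information is available via SENTINEL REPLICAS mymaster).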