Redis sentinel - How to take a server out of loop?

I had the following Sentinel deployment: 3 Redis instances on different servers, with one Sentinel on each of those servers (3 Sentinels in total).

I then realized that the server hosting the current master did not have much memory, so I stopped both the Sentinel and the Redis instance on that server and set up the same pair on a new machine. So I still have the same deployment: 3 Redis instances and 3 Sentinels.

The issue is that the Sentinels now report that the master is down, because they still think the master is the server I removed. How do I tell Sentinel to stop including that server?

Tarun asked Feb 29 '16 12:02




1 Answer

From the docs about Redis Sentinel, under the chapter Adding or removing Sentinels:

Removing a Sentinel is a bit more complex: Sentinels never forget already seen Sentinels, even if they are not reachable for a long time, since we don't want to dynamically change the majority needed to authorize a failover and the creation of a new configuration number. So in order to remove a Sentinel the following steps should be performed in absence of network partitions:

  1. Stop the Sentinel process of the Sentinel you want to remove.
  2. Send a SENTINEL RESET * command to all the other Sentinel instances (instead of * you can use the exact master name if you want to reset just a single master). One after the other, waiting at least 30 seconds between instances.
  3. Check that all the Sentinels agree about the number of Sentinels currently active, by inspecting the output of SENTINEL MASTER mastername of every Sentinel.
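Steps 2 and 3 above could be scripted roughly as follows. This is only a sketch: `send` is a hypothetical callable (not part of any library) that delivers one command to the Sentinel at a given address and returns its reply, and the code assumes that the `SENTINEL MASTER` reply arrives as a flat list of alternating field names and values that includes `num-other-sentinels`. The pause after the last reset is an addition of this sketch, to give the Sentinels time to rediscover each other before they are compared.

```python
import time


def pairs_to_dict(flat_reply):
    """Turn a flat [field, value, field, value, ...] reply into a dict."""
    return dict(zip(flat_reply[0::2], flat_reply[1::2]))


def reset_and_check(sentinels, master_name, send, wait_seconds=30, sleep=time.sleep):
    """Reset every remaining Sentinel in turn, then collect their views.

    `send(addr, *command_parts)` is a hypothetical helper supplied by the
    caller; it should send the command to the Sentinel at `addr` and return
    the reply.
    """
    # Step 2: one Sentinel after the other, pausing between instances.
    for addr in sentinels:
        send(addr, "SENTINEL", "RESET", master_name)
        sleep(wait_seconds)

    # Step 3: ask each Sentinel how many other Sentinels it currently knows.
    counts = {}
    for addr in sentinels:
        info = pairs_to_dict(send(addr, "SENTINEL", "MASTER", master_name))
        counts[addr] = int(info["num-other-sentinels"])
    return counts


def sentinels_agree(counts):
    """True when every Sentinel reports the same number of peers."""
    return len(set(counts.values())) <= 1
```

With three Sentinels remaining, each one would be expected to report two others; if the numbers differ, the reset probably needs to be repeated on the Sentinel that disagrees.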

Further:

Removing the old master or unreachable slaves.

Sentinels never forget about slaves of a given master, even when they are unreachable for a long time. This is useful, because Sentinels should be able to correctly reconfigure a returning slave after a network partition or a failure event.

Moreover, after a failover, the failed over master is virtually added as a slave of the new master, this way it will be reconfigured to replicate with the new master as soon as it will be available again.

However sometimes you want to remove a slave (that may be the old master) forever from the list of slaves monitored by Sentinels.

In order to do this, you need to send a SENTINEL RESET mastername command to all the Sentinels: they'll refresh the list of slaves within the next 10 seconds, only adding the ones listed as correctly replicating from the current master INFO output.
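To confirm that the reset had the intended effect, the reply of `SENTINEL SLAVES mastername` can be inspected on each Sentinel once the refresh window has passed. The sketch below assumes the reply has already been fetched by some client and is shaped as a list of entries, each a flat list of alternating field names and values with `ip` and `port` fields; the addresses in the sample are made up for illustration.

```python
def replica_addresses(slaves_reply):
    """Extract (ip, port) pairs from a SENTINEL SLAVES style reply."""
    addresses = set()
    for flat in slaves_reply:
        # Each entry alternates field name and value.
        fields = dict(zip(flat[0::2], flat[1::2]))
        addresses.add((fields["ip"], int(fields["port"])))
    return addresses


def still_listed(slaves_reply, removed):
    """True if the removed server is still tracked as a replica."""
    return removed in replica_addresses(slaves_reply)


# Hypothetical reply from one Sentinel after the reset (made-up addresses).
sample_reply = [
    ["name", "10.0.0.2:6379", "ip", "10.0.0.2", "port", "6379"],
    ["name", "10.0.0.4:6379", "ip", "10.0.0.4", "port", "6379"],
]

# The decommissioned server (assumed here to be 10.0.0.1:6379) should be gone.
removed_server = ("10.0.0.1", 6379)
needs_another_reset = still_listed(sample_reply, removed_server)
```

If the removed server is still listed on some Sentinel, that Sentinel most likely has not processed the reset yet or was skipped.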

Linus Thiel answered Sep 18 '22 22:09
