
How to use Consul in leader election?

Tags: consul

How do I use Consul to make sure only one service is performing a task?

I've followed the examples at http://www.consul.io/ but I am not 100% sure which way to go. Should I use KV? Should I use services? Or should I register a service as a health check and have the cluster call it at a given interval?

For example, imagine there are several data centers. Within every data center there are many services running, and every one of these services can send emails. Each service has to check whether there are any emails to be sent and, if there are, send them. However, I don't want the same email to be sent more than once.

How would I make sure that all emails are sent and none is sent more than once?

I could do this using other technologies, but I am trying to implement this using Consul.

Alexandre Santos asked Dec 28 '14 18:12




2 Answers

This is exactly the use case for Consul distributed locks.

For example, let's say you have three servers in different AWS availability zones for failover. Each one is launched with:

consul lock -verbose lock-name ./run_server.sh

Consul will only run the ./run_server.sh command on whichever server acquires the lock first. If ./run_server.sh fails on the server holding the lock, the lock is released and whichever other node acquires it next will execute ./run_server.sh. This way you get failover with only one server running at a time. If you registered your Consul health checks properly, you'll be able to see that the server on the first node failed; you can then repair it and restart consul lock ... on that node, where it will block until it can acquire the lock.
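If you would rather take the lock from inside your service than wrap it with the CLI, the same idea can be sketched against Consul's HTTP API: create a session, then try to acquire a KV key with it. This is a minimal sketch, not a definitive implementation. The key name `service/mailer/leader`, the `LeaderLock` class and the `consul_put` helper are made up for illustration, and the agent address is assumed to be the default local one.

```python
import json
import urllib.request

CONSUL = "http://127.0.0.1:8500"  # assumption: default local agent address


def consul_put(path, body=None):
    """PUT to the local Consul agent and return the decoded JSON response."""
    data = json.dumps(body).encode() if body is not None else b""
    req = urllib.request.Request(CONSUL + path, data=data, method="PUT")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode())


class LeaderLock:
    """Hypothetical helper: at most one holder per key, backed by a Consul session."""

    def __init__(self, key, put=consul_put):
        self.key = key        # illustrative key, e.g. "service/mailer/leader"
        self.put = put        # injectable so the logic can be tried without a cluster
        self.session = None

    def acquire(self, ttl="15s"):
        # The session ties the lock to this process; with a TTL it
        # expires unless it is renewed.
        if self.session is None:
            created = self.put("/v1/session/create", {"Name": self.key, "TTL": ttl})
            self.session = created["ID"]
        # Consul answers true only to the session that wins the key.
        won = self.put(f"/v1/kv/{self.key}?acquire={self.session}",
                       {"holder": self.session})
        return won is True

    def release(self):
        # Give the key back and destroy the session so another node can win it.
        if self.session is not None:
            self.put(f"/v1/kv/{self.key}?release={self.session}")
            self.put(f"/v1/session/destroy/{self.session}")
            self.session = None
```

A node would call `LeaderLock("service/mailer/leader").acquire()` and only send emails while that returns true. A real service would also have to renew the session periodically (`PUT /v1/session/renew/<id>`) and stop working as soon as it loses the lock; both are left out here to keep the sketch short.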

Currently, distributed locking can only happen within a single Consul datacenter. But since it is up to you to decide which Consul servers make up a datacenter, you should be able to solve your issue. If you want locking across federated Consul datacenters, you'll have to wait, since it's a roadmap item.

jeremyjjbrown answered Oct 06 '22 21:10


First point: The question is how to use Consul to solve a specific problem. However, Consul cannot solve that specific problem because of limitations intrinsic to a gossip protocol.

When one datacenter cannot talk to another, you cannot safely determine whether the problem is the network or the affected datacenter.

The usual solution is to define what happens when one DC cannot talk to another one. For example, if we have three datacenters (DC1, DC2, and DC3), we can decide that whenever one DC cannot talk to the other two DCs, it will stop updating the database.

If DC1 cannot talk to DC2 and DC3, then DC1 will stop updating the database, and the system will assume DC2 and DC3 are still online.

If DC2 and DC3 are still online and can talk to each other, then we have a quorum to continue running the system.

When DC1 comes online again, it will catch up with the database.
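The rule above boils down to a majority check. As a rough sketch (the function name `has_quorum` and its parameters are just for illustration), a DC keeps writing only if it and the DCs it can still reach make up more than half of all DCs:

```python
def has_quorum(reachable_peers, total_datacenters):
    """True if this DC plus the peers it can reach form a strict majority."""
    visible = reachable_peers + 1  # count this DC as well
    return visible > total_datacenters // 2


# DC1 is cut off from DC2 and DC3: it sees only itself, so it stops writing.
print(has_quorum(0, 3))  # False

# DC2 can still reach DC3: two out of three is a majority, so they keep running.
print(has_quorum(1, 3))  # True
```

With an even number of DCs, a clean split leaves neither half with a majority, which is one reason an odd number is usually preferred.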

Where can Consul help here? It can communicate between DCs and check if they are online... but so can ICMP.

Take a look at the comments. Did this answer your question? Not really. But I don't think the question has an answer.

Second point: The question is "How to use Consul in leader election?" It would have been better to ask how Consul elects a new leader, or "Given the documentation at consul.io, can you give me an example of how to determine the leader using Consul?"

If that is what you really want, then the question was already answered: How does a Consul agent know it is the leader of a cluster?

user1293962 answered Oct 06 '22 23:10