Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does raft handle committing entries from previous one?

In raft paper section 5.4.2

If a leader crashes before committing an entry, future leaders will attempt to finish replicating the entry. However, a leader cannot immediately conclude that an entry from a previous term is committed once it is stored on a majority of servers. There could be a situation where an old log entry is stored on a majority of servers, yet can still be overwritten by a future leader.

The author mentioned that to avoid the situation above

To eliminate problems like the one in Figure 8, Raft never commits log entries from previous terms by counting replicas. Only log entries from the leader’s current term are committed by counting replicas; once an entry from the current term has been committed in this way, then all prior entries are committed indirectly because of the Log Matching Property.

But wouldn't the same problem still occur?

Given the following situation that the author provided

edge case

When S5 is elected leader it only looks at its current committed log which is (term3, index1) and this is going to override term2 entries in all followers.

How does making a leader looking at its own committed log solve the problem?

like image 750
Jal Avatar asked Sep 14 '17 15:09

Jal


People also ask

What is log replication in raft?

RAFT performs Log Replication through its Log Matching Properties, which states that: If two entries in different logs have same the index and term, then they store the same command. It means, every numbered slot in the logs must hold the same command for all the replicated machines.

How does raft algorithm work?

Raft achieves consensus via an elected leader. A server in a raft cluster is either a leader or a follower, and can be a candidate in the precise case of an election (leader unavailable). The leader is responsible for log replication to the followers.

What is raft's solution to avoid unnecessary runs of the leader election protocol?

Raft uses randomized election timeouts to ensure that split votes are rare and that they are resolved quickly. To prevent split votes in the first place, election timeouts are chosen randomly from a fixed interval (e.g., 150–300ms).

How does a raft election start?

The Raft algorithm divides time into smaller terms of arbitrary lengths. With each term, an election is initiated, and a leader is chosen by majority vote. In case of no majority, the term terminates without any leader. The term number increases monotonically and is exchanged in every communication.


1 Answers

Read the caption on this image. Both (d) and (e) are possible resolutions to the state of the log produced by (a), (b), and (c). The problem is, even though in (c) entry (2, 2) was replicated to a majority of the cluster, this is illustrating that it could still be overwritten when S5 is elected in (d). So, the solution is to only allow nodes to commit entries from their own term. In other words, replicating an entry on a majority of nodes does not equal commitment. In (c), entry (2, 2) is replicated to a majority of the cluster, but because it's not in the leader's term (at least 4) it's not committed. But in (e), after the leader replicates an entry from its current term (4), that prevents the situation in (d) from occurring since S5 can no longer be elected leader.

like image 110
kuujo Avatar answered Sep 28 '22 07:09

kuujo