The Consensus Problem

Consensus is the process of getting multiple nodes to agree on a single data value or a sequence of operations. This is critical for building replicated state machines, like a distributed database where all copies of the data must reflect the same operations in the same order.

Definition 4.1
Consensus Properties

A consensus algorithm must satisfy:

  1. Termination: Every non-faulty process must eventually decide on some value.
  2. Agreement: All non-faulty processes that decide must decide on the same value.
  3. Validity: The decided value must have been proposed by some process.

Paxos & Raft

Historically, Paxos (Leslie Lamport; first circulated in 1989 and published in 1998) was the gold standard for consensus. However, it was famously difficult to understand and implement correctly. Raft (Diego Ongaro and John Ousterhout, 2014) was created specifically to be more understandable than Paxos while providing the same safety guarantees.

[Figure: Consensus message complexity per decision (lower is better) — Raft: 2 messages, Paxos: 3 messages, PBFT: 12 messages]

The Raft Algorithm

Raft achieves consensus by electing a Leader. The leader takes full responsibility for managing the replicated log.

[Figure: Typical Raft cluster state distribution — Followers: 80%, Leader: 20%]
  1. Leader Election: Nodes are either Leaders, Followers, or Candidates. If a Follower receives no heartbeat within a randomized timeout, it becomes a Candidate and requests votes from the other nodes.
  2. Log Replication: The Leader accepts client requests, appends them to its log, and sends AppendEntries RPCs to Followers.
  3. Commitment: Once a majority of Followers acknowledge the entry, the Leader commits it and applies it to its state machine.
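The three steps above can be sketched as a minimal simulation. This is an illustrative toy, not a real Raft implementation: the class and method names are invented for this example, and real Raft additionally handles term comparison on RPCs, log-matching checks, and persistence.

```python
import random

# Hypothetical sketch of the three Raft steps above (names are illustrative).
FOLLOWER, CANDIDATE, LEADER = "follower", "candidate", "leader"

class Node:
    def __init__(self, node_id, cluster_size):
        self.id = node_id
        self.cluster_size = cluster_size
        self.state = FOLLOWER
        self.term = 0
        self.log = []            # list of (term, command) entries
        self.commit_index = -1   # index of the last committed entry
        # Randomized timeouts make simultaneous candidacies unlikely (step 1).
        self.election_timeout_ms = random.uniform(150, 300)

    def start_election(self):
        """Step 1: a timed-out Follower becomes a Candidate and votes for itself."""
        self.state = CANDIDATE
        self.term += 1
        return 1  # one vote: its own

    def tally_votes(self, votes):
        """A Candidate wins with a strict majority of the cluster."""
        if votes > self.cluster_size // 2:
            self.state = LEADER

    def append_entry(self, command, follower_acks):
        """Steps 2-3: the Leader appends, replicates, and commits on majority ack."""
        assert self.state == LEADER
        self.log.append((self.term, command))
        # The leader counts itself; commit once a majority holds the entry.
        if follower_acks + 1 > self.cluster_size // 2:
            self.commit_index = len(self.log) - 1
```

For example, in a 5-node cluster a candidate that collects 3 votes becomes Leader, and an entry acknowledged by 2 followers (3 copies including the leader's) is committed.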

Theorem 4.2
Majority Quorum

In a system of N nodes, consensus algorithms like Raft require a strict majority (quorum) of ⌊N/2⌋ + 1 nodes to agree. This ensures that any two quorums must overlap by at least one node, preventing "split brain" scenarios.
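The overlap property can be checked exhaustively for small clusters. This sketch (function names are illustrative) enumerates every pair of majority quorums and verifies they share at least one node:

```python
from itertools import combinations

def quorum_size(n):
    # Strict majority: floor(n/2) + 1
    return n // 2 + 1

def quorums_overlap(n):
    """True if every pair of majority quorums in an n-node cluster intersects."""
    nodes = range(n)
    q = quorum_size(n)
    return all(set(a) & set(b)
               for a in combinations(nodes, q)
               for b in combinations(nodes, q))

# Holds for every small cluster size we try.
assert all(quorums_overlap(n) for n in range(1, 8))
```

The intuition: two disjoint sets of size ⌊N/2⌋ + 1 would together need at least N + 1 nodes, which is more than exist, so any two quorums must intersect.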

Example 4.3
Fault Tolerance

If you have a 5-node Raft cluster, it requires ⌊5/2⌋ + 1 = 3 nodes to form a quorum. This means the cluster can continue to operate even if 2 nodes fail.
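The same arithmetic gives the general fault-tolerance bound: an N-node cluster survives as long as a quorum remains, so at most N minus the quorum size may fail. A quick sketch (the function name is illustrative):

```python
def max_failures(n):
    # The cluster stays available while a quorum of n//2 + 1 nodes survives,
    # so at most n - (n//2 + 1) nodes may fail.
    return n - (n // 2 + 1)

# A 5-node cluster tolerates 2 failures, as in the example above.
assert max_failures(5) == 2
```

Note that growing from 3 to 4 nodes buys nothing: both tolerate exactly 1 failure, which is why clusters are usually deployed with an odd number of nodes.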

In the next chapter, we will build upon these foundations to explore how to manage large-scale data through replication.