In a distributed system, consistency refers to the degree to which all replicas of a data object agree. It is not a binary choice but a spectrum of guarantees, trading performance against the strength of the guarantee.
A consistency model is a contract between the system and its users: it defines which values a read is guaranteed to return, and in what order writes become visible.
Linearizability is the gold standard for consistency. It makes the system behave as if there were only a single copy of the data, and all operations were instantaneous.
- Once a write is completed, every subsequent read (no matter which node it queries) must return the new value.
- Once any read observes a value, every later read on any node must also observe it; the system never appears to go backward in time.
Figure: linearizability implies read-after-write consistency.
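The single-copy behavior can be sketched in a few lines. This is a minimal, single-process illustration (the class name `LinearizableRegister` is hypothetical, not a real library): a mutex serializes every operation through one logical copy, so a read that begins after a write completes always returns the new value.

```python
import threading

class LinearizableRegister:
    """Single-copy register: a lock serializes all operations,
    so every read returns the most recently completed write."""
    def __init__(self, value=None):
        self._lock = threading.Lock()
        self._value = value

    def write(self, value):
        with self._lock:
            self._value = value

    def read(self):
        with self._lock:
            return self._value

reg = LinearizableRegister()
reg.write(42)
assert reg.read() == 42  # read-after-write always holds
```

In a real cluster the lock becomes global coordination (consensus, a leader lease), which is exactly where the cost discussed next comes from.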
Linearizability is extremely expensive. It requires global coordination and is highly sensitive to network latency and partitions (as seen in the CAP theorem).
Because strong consistency is hard to scale, many systems use weaker models.
Eventual Consistency
The system guarantees that if no new writes are made, all replicas will eventually converge to the same value.
- Pro: High availability and performance.
- Con: Users might see stale data for an undefined period.
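Convergence is often achieved with a merge rule such as last-writer-wins plus periodic anti-entropy (gossip). A hedged sketch, with a hypothetical `LWWReplica` class, shows both the stale-read window and the eventual agreement:

```python
class LWWReplica:
    """Last-writer-wins replica: each write carries a timestamp;
    merging keeps the newest value, so replicas converge."""
    def __init__(self):
        self.value, self.ts = None, 0

    def write(self, value, ts):
        if ts > self.ts:
            self.value, self.ts = value, ts

    def merge(self, other):
        # Anti-entropy: adopt the peer's state if it is newer.
        self.write(other.value, other.ts)

a, b = LWWReplica(), LWWReplica()
a.write("draft", ts=1)   # client writes to replica a
b.write("final", ts=2)   # later write lands on replica b
# Before any sync, readers of a still see the stale "draft".
a.merge(b)
b.merge(a)               # one gossip round
assert a.value == b.value == "final"  # replicas have converged
```

Real systems replace the scalar timestamp with something more robust (hybrid logical clocks, version vectors), but the convergence argument is the same.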
Causal Consistency
Operations that are causally related (e.g., a reply to a message) must be seen in the same order on all nodes. Concurrent operations can be seen in any order.
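Causal order is typically tracked with vector clocks: event X happens-before event Y exactly when X's clock is componentwise less than or equal to Y's and they differ somewhere. A small sketch (the dictionaries below are illustrative, not a specific system's wire format):

```python
def happens_before(vc1, vc2):
    """True iff vc1 -> vc2: vc1 <= vc2 componentwise and vc1 != vc2."""
    return all(vc1[k] <= vc2[k] for k in vc1) and vc1 != vc2

# Node A posts a message; node B replies after seeing the post,
# so the reply's clock includes the post's entry for A.
post  = {"A": 1, "B": 0}
reply = {"A": 1, "B": 1}
other = {"A": 0, "B": 1}   # a write made without seeing the post

assert happens_before(post, reply)      # causally related: must be ordered
assert not happens_before(post, other)  # neither dominates the other...
assert not happens_before(other, post)  # ...so they are concurrent
```

A causally consistent store must deliver `post` before `reply` on every node, but is free to interleave `other` either way.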
| Model | Guarantee | Performance |
|---|---|---|
| Linearizability | Single-copy behavior | Very Low |
| Causal | Ordered related events | Medium |
| Eventual | Convergence over time | Very High |
Use Linearizability for financial transactions or distributed locks. Use Eventual Consistency for social media feeds or search indices where a few seconds of lag is acceptable.
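Distributed locks illustrate why linearizability matters: acquisition is usually built on an atomic compare-and-swap against the lock's owner field. A single-process sketch (the `CASLock` class is hypothetical; a threading lock stands in for the store's atomicity):

```python
import threading

class CASLock:
    """Lock acquired via compare-and-swap: acquisition succeeds only
    when the owner field atomically transitions from None to the caller."""
    def __init__(self):
        self._guard = threading.Lock()  # models the store's atomic CAS
        self._owner = None

    def try_acquire(self, client):
        with self._guard:
            if self._owner is None:   # CAS: expect None, swap in client
                self._owner = client
                return True
            return False

    def release(self, client):
        with self._guard:
            if self._owner == client:
                self._owner = None

lock = CASLock()
assert lock.try_acquire("alice")
assert not lock.try_acquire("bob")   # held: second acquire must fail
lock.release("alice")
assert lock.try_acquire("bob")       # free again: bob succeeds
```

If the store backing the CAS were only eventually consistent, two clients could both observe `owner is None` and both "win", which is why locks demand the linearizable end of the spectrum.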
In the next chapter, we will discuss how nodes discover each other and detect failures in a dynamic cluster.