Brewer's Theorem

The CAP theorem, formulated by Eric Brewer, states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:

  1. Consistency (C): Every read receives the most recent write or an error.
  2. Availability (A): Every request receives a (non-error) response, without the guarantee that it contains the most recent write.
  3. Partition Tolerance (P): The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.
Nice to know
P is not optional

Because distributed systems are built on asynchronous networks, network partitions are inevitable. Therefore, a distributed system must choose Partition Tolerance. The real choice is between Consistency and Availability during a partition.

| System Type | Benefit | Common Example |
| --- | --- | --- |
| CP | Consistent + Partition Tolerant | HBase, MongoDB |
| AP | Available + Partition Tolerant | Cassandra, DynamoDB |
| CA | Consistent + Available | RDBMS (on a single node) |

The Tradeoff: CP vs AP

When a network partition occurs, the nodes on different sides of the partition cannot communicate.

CP (Consistency + Partition Tolerance)

If the system chooses consistency over availability, it must ensure that no stale data is returned. Therefore, nodes that cannot communicate with the primary or a majority of nodes will simply return an error (or time out) rather than return potentially outdated information.

  • Examples: HBase, MongoDB (default), Redis, ZooKeeper.
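A minimal sketch of the CP choice, using hypothetical in-memory stand-ins for replicas (not any real database's API): a read succeeds only if at least a quorum of replicas responds, and otherwise the system raises an error rather than risk returning stale data.

```python
class QuorumError(Exception):
    """Raised when too few replicas respond to guarantee consistency."""

def cp_read(key, replicas, quorum):
    """Return the newest value seen by at least `quorum` replicas, or fail.

    `replicas` is a list of dicts standing in for replica stores;
    an unreachable replica (on the far side of a partition) is None.
    Each stored value is a (version, data) pair.
    """
    responses = [r[key] for r in replicas if r is not None and key in r]
    if len(responses) < quorum:
        # CP behavior: refuse to answer rather than return stale data.
        raise QuorumError(f"only {len(responses)} of {quorum} required replicas reachable")
    # Pick the value with the highest version number.
    return max(responses)[1]

# Three replicas; one side of the partition is unreachable (None).
replicas = [{"x": (2, "new")}, {"x": (1, "old")}, None]
print(cp_read("x", replicas, quorum=2))  # → new
```

With `quorum=3` the same call raises `QuorumError`, which is exactly the error-or-timeout behavior described above.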

AP (Availability + Partition Tolerance)

If the system chooses availability over consistency, it will return the most recent version of the data it has, even if it cannot guarantee it is the absolute latest version.

  • Examples: Cassandra, CouchDB, DynamoDB, Riak.
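By contrast, an AP-style read (same hypothetical replica stand-ins as above) answers from whatever replicas are reachable, even if the only reachable copy is stale:

```python
def ap_read(key, replicas):
    """Return the newest value among the *reachable* replicas.

    During a partition this may be stale, but it is never an error
    as long as any single replica can answer. Each stored value is
    a (version, data) pair; unreachable replicas are None.
    """
    responses = [r[key] for r in replicas if r is not None and key in r]
    if not responses:
        raise KeyError(key)  # no replica at all can answer
    return max(responses)[1]

# Partition: only the replica holding the old version is reachable.
replicas = [None, {"x": (1, "old")}, None]
print(ap_read("x", replicas))  # → old (stale, but available)
```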

System Trade-offs Analysis

(Figure: comparison of example systems along Consistency, Availability, Latency, Scalability, and Partition-tolerance axes.)

Visualizing a Network Partition (P)

(Diagram: Clients 1 and 2 issue reads/writes against Nodes A, B, and C; the heartbeat link between the nodes is severed, leaving the cluster partitioned.)
| System Type | During Partition | Use Case |
| --- | --- | --- |
| CP | Returns error/timeout | Banking, Billing, Inventory |
| AP | Returns stale data | Social Media Feeds, Shopping Carts |

Beyond CAP: The PACELC Theorem

The PACELC theorem, described by Daniel Abadi in 2010, extends the CAP theorem by addressing the tradeoff between latency and consistency even when there is no network partition.

PACELC stands for:

  • If there is a Partition, how does the system choose between Availability and Consistency?
  • Else (no partition), how does the system choose between Latency and Consistency?
$$
\text{PACELC: } P \Rightarrow (A \lor C),\quad E \Rightarrow (L \lor C) \tag{3.1}
$$

Common PACELC Classifications

  1. PA/EL: Choose availability during partition, low latency during normal operation (e.g., DynamoDB, Cassandra).
  2. PC/EC: Choose consistency during partition, consistency during normal operation (e.g., BigTable, HBase).
  3. PA/EC: Choose availability during partition, but prioritize consistency during normal operation (e.g., MongoDB with primary-only reads).
Intuition
The Latency Tradeoff

Even in a perfectly healthy network, sending data to multiple replicas for consistency takes more time than returning a result from a single local node. This latency tax is the core of the else side of PACELC.
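The latency tax can be made concrete with a toy write path (simulated network delay, hypothetical function names): an EC-style write blocks until every replica acknowledges, while an EL-style write returns after the local write and leaves the remaining replicas to catch up in the background.

```python
import time

REPLICA_DELAY = 0.05  # simulated one-way network delay per remote ack

def write_ec(value, replicas):
    """Consistency-first (EC): block until every replica has the value."""
    start = time.perf_counter()
    for replica in replicas:
        time.sleep(REPLICA_DELAY)  # wait for the remote acknowledgement
        replica.append(value)
    return time.perf_counter() - start

def write_el(value, replicas):
    """Latency-first (EL): write locally, replicate asynchronously."""
    start = time.perf_counter()
    replicas[0].append(value)  # local write only
    elapsed = time.perf_counter() - start
    # (In a real system the remaining replicas catch up in the background.)
    return elapsed

ec_latency = write_ec("v1", [[], [], []])
el_latency = write_el("v1", [[], [], []])
print(f"EC write: {ec_latency:.3f}s, EL write: {el_latency:.3f}s")
```

The EC write pays one replica delay per acknowledgement even though nothing is partitioned, which is precisely the Else-side tradeoff PACELC makes explicit.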