CAP Theorem

Once your data lives on more than one machine, a hard question appears: what happens when those machines can't talk to each other? The CAP theorem is the classic answer. It says that when the network between your servers breaks, you're forced into an uncomfortable choice — and pretending otherwise leads to systems that quietly corrupt data or fall over at the worst time.

It's one of the most cited — and most misquoted — ideas in distributed systems. The good news is the core idea is simple once you stop thinking of it as "pick any two."

What the three letters mean

CAP stands for three properties a distributed data system might have:

Consistency (C) — every read sees the most recent write. Ask any node for a value and you get the same, up-to-date answer. (This is a stronger promise than the "C" in database ACID — here it means all nodes agree on the latest data.)
Availability (A) — every request gets a non-error response. The system always answers, even if some nodes are down. It never makes you wait forever or refuses you outright.
Partition tolerance (P) — the system keeps operating even when the network drops or delays messages between nodes, splitting them into groups that can't reach each other. This split is called a partition.

The actual theorem

Here's the part people get wrong. In any system that spans a network, partitions will happen — cables get cut, switches fail, packets get lost. You don't get to opt out of partition tolerance; the network decides, not you. So P is a given, not a choice.

That leaves the genuine trade-off: when a partition is in progress, a node that's cut off from its peers must decide what to do with an incoming request. Either it answers using only what it can see locally — risking giving out stale data (choosing Availability) — or it refuses to answer until it can confirm it's up to date (choosing Consistency). It cannot do both at once. That is the whole theorem.

How it works

Picture two database replicas, Node A and Node B, with a client reading and writing data. Normally life is easy: a write lands on A, A copies it over to B, and a read from either node returns the same fresh value. The two nodes stay in sync because they can talk.

Now the link between A and B is severed — a partition. A and B are both still alive and still receiving requests, but they can no longer share updates. A write to A never reaches B. At this instant the system is forced into an either/or: when the client asks the cut-off node for data, that node can stay consistent by refusing to answer (it can't be sure it has the latest write), or it can stay available by answering with whatever it has — possibly stale. The animation below walks through normal replication, then the partition, then both sides of that forced choice.

Consistency vs. availability under a partition

write / read

Client

Node A

Node B

Normally, a write to one node replicates to the other — every read is consistent.

Note

"Pick 2 of 3" is the myth. You can't trade away partition tolerance, because partitions aren't a feature you enable — they're a fact of networks. So the real menu has only two items, and you only order from it during a partition: do you want CP (consistent but maybe unavailable) or AP (available but maybe stale)?

CP vs AP, in plain terms

CP — choose Consistency, sacrifice Availability. The cut-off node would rather say "I can't help you right now" than risk handing back wrong data. Think of a bank balance: if Node B can't confirm whether a withdrawal happened on Node A, it should reject the request rather than let you spend money you no longer have. Correctness matters more than always answering.

AP — choose Availability, sacrifice Consistency. The node answers no matter what, accepting that the data might be slightly out of date and will be reconciled later. Think of the like count on a social post: if it shows 1,042 instead of 1,043 for a few seconds during a network blip, nobody is harmed. Staying responsive matters more than being exactly right.

Clearing up the "pick 2" misconception

The popular shorthand — "CAP means choose two of the three" — is misleading in two ways.

First, P is non-negotiable. A single-machine database can ignore CAP entirely, but the moment data is distributed across a network, partition tolerance is forced on you. So the choice is never "C and A together" by simply dropping P — dropping P just means you've built a system that breaks badly when the inevitable partition arrives.

Second, the trade-off is conditional, not permanent. When the network is healthy and nodes can talk, a good system delivers both consistency and availability. The C-versus-A decision only has to be made during a partition. A system labeled "CP" isn't always unavailable, and an "AP" system isn't always stale — those labels just describe which property it gives up while a partition is happening.

Watch out

Don't assume a partition is rare enough to ignore. Brief partitions — a flaky switch, an overloaded link, a slow cross-region hop — happen constantly at scale, and they're hardest to handle precisely when traffic is high. If you never decided whether you're CP or AP, the system decides for you, often by silently returning stale data or by hanging until requests time out.

When each choice fits

Reach for CP when stale or conflicting data is unacceptable: money, inventory counts, unique-username checks, anything where two disagreeing answers could cause real harm. You accept that some requests will fail during a partition in exchange for never being wrong.

Reach for AP when staying responsive matters more than perfect freshness: feeds, view and like counts, product catalogs, caches, recommendations. A little temporary disagreement is fine because the data converges once the partition heals.

Note that CAP is about partition behavior only; latency and throughput are separate concerns you tune with techniques like load balancing. Most real systems also aren't purely one or the other — they pick CP for the operations that need it and AP for the ones that don't.