CHAPTER 02
3 DIAGRAMS · ~9 MIN
Databases at Scale
The database is where your bugs become permanent. Schema, replication, and index decisions outlast every other architectural choice.
02.1 · CONCEPT
Replication
Copy the data so reads scale and a single node failure isn't fatal. The cost is consistency — replicas always lag, and the lag is a feature, not a bug.
ASYNC vs SYNC
Async replication is the default: low write latency, replicas may be seconds behind. Sync waits for at least one replica — survives loss but doubles tail latency.
READ-YOUR-WRITES
Route a user's reads to the primary for N seconds after their write, or until the replicas catch up. Otherwise users see stale data and file bug reports.
FAILOVER
Promote a replica when the primary dies. The dangerous case is split-brain: old primary comes back, both accept writes. Use a coordinator (etcd, Sentinel) to fence.
FIG · 02.1
02.2 · CONCEPT
Sharding
When a single primary can't hold the dataset or absorb the writes, split it. The choice of shard key is the single most important — and irreversible — schema decision.
SHARD KEY
Pick a key with high cardinality and even access. user_id beats country (US is hot). Avoid timestamps (today's shard takes all writes).
CONSISTENT HASHING
Hash-mod-N rebalances 100% of keys when N changes. Consistent hashing moves only 1/N. Use it the moment you might add a shard.
CROSS-SHARD QUERIES
JOINs across shards are scatter-gather: query every shard, merge results. Slow and fragile. Denormalise or maintain a search index instead.
FIG · 02.2
02.3 · CONCEPT
Indexes & Trade-offs
An index is a cached copy of one query plan, kept in sync by the database on every write. Powerful and expensive — most production databases have too many.
READ vs WRITE
Every index turns one row write into N+1 writes. On a hot insert path, six secondary indexes can dominate IO.
COVERING INDEXES
If the index contains every column the query needs, the engine never touches the table. Huge win for hot read paths — at the cost of disk.
RIGHT TOOL
B-tree for range scans. Hash for equality. GIN for arrays/JSON. Bitmap for low-cardinality columns. The engine cannot fix a wrong choice.
FIG · 02.3
← PREVIOUS · 01
Scalability Fundamentals
NEXT · 03 →
Distributed Systems