System design, explained with pizza.
No walls of text. No jargon first. We start with a pizza shop and map every concept to the real world. By the end, you'll never forget how these systems work.
The Fuse Box
The oven keeps tripping the fuse. Instead of calling the electrician every 5 minutes, you install a circuit breaker. After 3 trips, it stays open for 30 seconds, giving the oven a break. Then it tries again. If it works, great. If not, it opens again.
A circuit breaker stops sending requests to a failing service. After N failures, it 'opens' and rejects calls immediately without contacting the service. After a timeout, it 'half-opens' and lets a trial request through: success closes it again, failure reopens it. This prevents cascading failures.
Fuse box = circuit breaker. 3 strikes = stop calling the broken service. Wait 30s = half-open test. Still broken? Stay open.
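Here is the fuse box as a minimal Python sketch. It is an illustration, not any library's API: the names `CircuitBreaker`, `CircuitOpenError`, `max_failures`, and `reset_timeout` are made up for this example, and the defaults (3 failures, 30 seconds) just mirror the pizza story. It is not thread-safe and it does not limit how many probes run at once in the half-open state, which a production breaker would.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the service."""


class CircuitBreaker:
    # Hypothetical helper for illustration only.
    def __init__(self, max_failures: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures    # "3 strikes"
        self.reset_timeout = reset_timeout  # "wait 30s"
        self.clock = clock                  # injectable so tests can fake time
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the breaker is closed

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, fn: Callable[[], T]) -> T:
        state = self.state
        if state == "open":
            # Fail fast: the broken service is never called.
            raise CircuitOpenError("circuit open, failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed half-open probe, or too many failures, (re)opens the breaker.
            if state == "half-open" or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Any success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Usage: wrap each outbound call, for example `breaker.call(lambda: charge_card(order))`, where `charge_card` stands in for whatever function talks to the flaky service. Catch `CircuitOpenError` to serve a fallback instead of waiting on a dead dependency.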
Your service calls the Payment API. It's down. Every request waits 30 seconds before timing out. Soon your threads are all stuck waiting. What problem is this?
- "How do you stop a slow downstream service from killing your service?" → Circuit breaker + timeouts + bulkheads. Fail fast, isolate thread pools per dependency.
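- "Tune the breaker for me." → Trip after ~50% errors in a 10s window with min 20 requests. Stay OPEN for 30s, then HALF-OPEN with 1 probe. (The 50% / 10s / 20-request thresholds match Hystrix's defaults. The 30s wait is a tuning choice: Hystrix defaults to 5s and Resilience4j to 60s.)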
- "What's a thundering herd?" → 10k clients retry at the EXACT same instant after an outage and effectively DDoS the recovering service. Fix: exponential backoff + jitter.
- "Where do you place the breaker: client or server?" → On the CLIENT side, around outbound calls. Each caller decides when its dependency is unhealthy.
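The "exponential backoff + jitter" fix from the thundering herd answer can be sketched in a few lines of Python. This uses the "full jitter" variant (pick a random delay between zero and a capped, doubling ceiling). The function name `backoff_delays` and the `base` and `cap` values are assumptions chosen for illustration, not recommended settings.

```python
import random
from typing import List, Optional


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return one randomized sleep time (in seconds) per retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # The ceiling doubles each attempt (0.5s, 1s, 2s, ...) but never exceeds cap.
        ceiling = min(cap, base * 2 ** attempt)
        # Jitter: each client picks its own random point below the ceiling,
        # so 10k clients do not all retry at the same instant.
        delays.append(rng.uniform(0.0, ceiling))
    return delays
```

A client would sleep for `delays[i]` before retry number `i`. Without the random draw, every client that failed at the same moment would also retry at the same moment, which is exactly the herd.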
Close your eyes and picture the pizza shop. Walk through the story in your head. If you can narrate the pizza version, you can explain the technical version. The analogy is your anchor; the jargon is just a vocabulary layer on top.