Designing Data-Intensive Applications
Editorial pick
The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
By Martin Kleppmann · O'Reilly Media · 2017
The single textbook every backend engineer should own. Survives every fashionable database wave.
Editorial take
Kleppmann's book is a generational textbook. It explains the fundamental ideas behind storage, replication, partitioning, consistency, consensus, and stream processing in a way that's neither tied to a specific database vendor nor allergic to detail. If you've ever wondered what "linearizability" actually means, what trade-offs Kafka is really making, or why your shiny new database vendor is hand-waving the consensus question, this is the book. Dense, but never unnecessarily so. It's the rare technical book that holds up a decade after publication because it's mostly about ideas, not products.
Last hand-checked 2026-05-18.
Read if you …
- design or operate distributed systems at any non-trivial scale
- are a senior backend engineer studying for system-design interviews
- evaluate database / streaming vendors and want to see through marketing
Skip if you …
- are a frontend / mobile engineer; most of this won't apply day-to-day
- want hands-on tutorials with running code; this is principles-first
If you only read one chapter
Consistency and Consensus
Chapter 9 is the cleanest treatment of distributed-consensus trade-offs you'll find outside academic papers. Worth the price alone.
Key ideas
- Reliability, scalability, and maintainability are the three north stars — name them explicitly.
- Replication, partitioning, and transactions are orthogonal — design each one deliberately.
- Stream and batch processing are increasingly the same thing, viewed at different cadences.
- Every database vendor's pitch hides a CAP-style trade-off; learn to find it.
About the book
Kleppmann wrote this book for O'Reilly as a researcher at the University of Cambridge, after working as a software engineer at companies including LinkedIn and Rapportive. The result is the most ambitious recent attempt to synthesize the distributed-systems literature for working engineers: it covers relational and non-relational storage, replication protocols, transactional models, MapReduce-and-after batch processing, and stream processing with Kafka-class systems, with references to the relevant academic papers throughout.
The 2nd edition is in progress; the 1st (2017) is still the canonical version. Treat it as both a textbook and a reference — most working engineers read it once linearly, then keep it on the shelf for the next time they're evaluating a database.
Pairs with
If Designing Data-Intensive Applications works for you, these likely will too.
A Philosophy of Software Design
Pick
John Ousterhout · 2018
The book that taught a generation of senior engineers a new vocabulary for 'why does this code feel bad?'
Read if you are a senior or staff engineer formalizing your taste in code review.
framework · <200p · intermediate
Staff Engineer
Pick
Will Larson · 2021
The first serious book on senior IC engineering careers — long overdue, immediately definitive.
Read if you are a senior engineer deciding between management and senior IC tracks.
framework · 200–350p · intermediate
The Software Engineer's Guidebook
Pick
Gergely Orosz · 2023
The most up-to-date map of the modern engineering career ladder, written by someone who actually walked it.
Read if you are 3–10 years into engineering and trying to decide what 'good' looks like at the next level.
framework · 350p+ · intermediate