Key-Value stores are the most basic but widely used data storages.
Design of key-value store consists of understanding the following topics:
What do we want from key-value store? Single server key-value store DISTRIBUTED key-value store: CAP theorem Real-world trade-offs for distributed systems System components: Data partition Data replication Consistency Inconsistency resolution: Versioning Handling all types of failures: Failure detection, Handling TEMPORARY failures, Handling PERMANENT failures, Handling data center outage System architecture diagram Write path Read path These items are disclosed in a very interesting Chapter 6 of the book:
Consistent Hashing is a cornerstone technology for distributed systems. Many of software developers don’t realize it, but Consistent Hashing is needed in many places: load balancers, caches, CDNs, id generators, databases, chats / social networks, and many other systems.
This topic consists of:
Problem with rehashing and why we need hashing to be CONSISTENT Hash space and hash ring BASIC approach (introduced by Karger et al. at MIT) Advanced approach with VIRTUAL NODES These items are disclosed in a very interesting Chapter 5 of the book:
Four standard steps for system design interview. However, I would think about them wider: as about four initial steps to design the software.
Step 1. Understand the problem and establish design scope Step 2. Propose high-level design and get buy-in Step 3. Design deep dive Step 4. Wrap up The chapter 3 of the book discovers details about each step, good questions to ask (to think about), DO’s and DONT’s.
A great generic plan for scaling any app from zero to millions of users.
Single server setup Selection and usage of database Vertical scaling vs horizontal scaling approaches. And why you should prefer horizontal Adding load balancer for horizontal scaling Adding database replication for horizontal scaling Adding cache Adding CDN Stateless vs Stateful architecture and using external state storage Adding extra Data Centers Adding Message queue Adding Logging, Metrics, and Automation Scaling database (sharding) and futher steps… All of these is carefully but briefly disclosed in the Chapter 1 of the book:
The book club of our company has chosen a new wonderful book for reading:
Robert Martin - Clean Architecture - a Craftsman’s Guide to Software Structure and Design
The part VI undermines some foundations 😀:
Do you know that Database is a “detail”? An unimportant minor low-level non-essential feature that can be neglected in architecture design! Do you know the same about the Web? It is just an unimportant IO device that should also be neglected in architecture design!
Earlier this year the book club of our company has studied excellent book:
Martin Kleppmann - Designing Data-Intensive Applications
This is the best book I have read about building complex scalable software systems. 💪
As usually (to better learn) I prepared an overview and mind-map.
Building blocks of the apps What is Reliability, Scalability and Maintainability. Examples and definitions. Faults and Failures Performance, Load, Latency and Response Time Operability, Simplicity, Evolvability Why you should randomly kill your servers 😅 How Twitter delivers 12,000 tweets per second to 300,000 readers per second.