Generating unique IDs seems like a simple task, but it is not in high-load distributed systems!
This topic consists of:
- Understanding the requirements and why it is a complicated task
- Possible solutions:
  - Multi-master replication
  - Universally unique identifier (UUID)
  - Ticket server
  - Twitter SNOWFLAKE approach (seems to be the best one!)
- Details:
  - Timestamp
  - Sequence number
- Other issues:
  - Clock synchronization
  - Section length tuning
  - High availability

These items are covered in the very interesting Chapter 7 of the book.
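To make the timestamp and sequence number parts more concrete, here is a minimal sketch of a Snowflake-style generator. The bit widths (41 / 5 / 5 / 12) follow the commonly described layout and are an assumption here, as are the names `SnowflakeGenerator`, `next_id` and `CUSTOM_EPOCH_MS`. Treat it as an illustration of the idea, not as Twitter's actual implementation.

```python
import threading
import time

# Assumed section lengths; tuning them trades ID lifetime against throughput.
DATACENTER_BITS = 5
MACHINE_BITS = 5
SEQUENCE_BITS = 12

MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1
MACHINE_SHIFT = SEQUENCE_BITS
DATACENTER_SHIFT = SEQUENCE_BITS + MACHINE_BITS
TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS + DATACENTER_BITS

# Illustrative custom epoch in milliseconds; any fixed past instant works.
CUSTOM_EPOCH_MS = 1288834974657


class SnowflakeGenerator:
    """Hypothetical generator: timestamp | datacenter | machine | sequence."""

    def __init__(self, datacenter_id, machine_id, clock=None):
        if not 0 <= datacenter_id < (1 << DATACENTER_BITS):
            raise ValueError("datacenter_id out of range")
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self._datacenter_id = datacenter_id
        self._machine_id = machine_id
        # The clock is injectable so the behaviour can be tested.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock synchronization issue: refuse rather than risk duplicates.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted in this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - CUSTOM_EPOCH_MS) << TIMESTAMP_SHIFT)
                | (self._datacenter_id << DATACENTER_SHIFT)
                | (self._machine_id << MACHINE_SHIFT)
                | self._sequence
            )
```

Because the timestamp occupies the highest bits, IDs from one generator sort by creation time, and the sequence number only has to separate IDs created within the same millisecond.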
Key-value stores are the most basic, yet widely used, kind of data store.
Designing a key-value store requires understanding the following topics:
- What do we want from a key-value store?
- Single server key-value store
- DISTRIBUTED key-value store:
  - CAP theorem
  - Real-world trade-offs for distributed systems
- System components:
  - Data partition
  - Data replication
  - Consistency
  - Inconsistency resolution: Versioning
  - Handling all types of failures:
    - Failure detection
    - Handling TEMPORARY failures
    - Handling PERMANENT failures
    - Handling data center outage
  - System architecture diagram
  - Write path
  - Read path

These items are covered in the very interesting Chapter 6 of the book.
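As a small illustration of inconsistency resolution through versioning, the sketch below uses vector clocks, one common way to version replicated values. The choice of vector clocks and the helper names `increment`, `descends` and `compare` are assumptions made for this example.

```python
def increment(clock, node):
    """Return a copy of the vector clock with this node's counter bumped."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if version a has seen every update that version b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two versions: 'equal', 'a_newer', 'b_newer' or 'conflict'."""
    a_covers_b = descends(a, b)
    b_covers_a = descends(b, a)
    if a_covers_b and b_covers_a:
        return "equal"
    if a_covers_b:
        return "a_newer"
    if b_covers_a:
        return "b_newer"
    # Neither version contains the other: the writes were concurrent
    # and something (often the client) has to reconcile them.
    return "conflict"


# Two replicas accept writes on top of the same base version.
base = increment(increment({}, "s1"), "s1")   # {"s1": 2}
left = increment(base, "s2")                  # {"s1": 2, "s2": 1}
right = increment(base, "s3")                 # {"s1": 2, "s3": 1}
```

Here `compare(left, base)` reports that `left` is newer, while `compare(left, right)` reports a conflict, because each side contains an update the other has not seen.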
Consistent Hashing is a cornerstone technology for distributed systems. Many software developers don't realize it, but Consistent Hashing is needed in many places: load balancers, caches, CDNs, ID generators, databases, chats / social networks, and many other systems.
This topic consists of:
- Problem with rehashing and why we need hashing to be CONSISTENT
- Hash space and hash ring
- BASIC approach (introduced by Karger et al. at MIT)
- Advanced approach with VIRTUAL NODES

These items are covered in the very interesting Chapter 5 of the book.
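The hash ring with virtual nodes can be sketched in a few lines. This is a minimal illustration under assumptions: the class name `HashRing`, the `server#i` naming of virtual nodes, the use of SHA-1 and the default of 100 virtual nodes per server are all choices made for the example.

```python
import bisect
import hashlib


def ring_hash(key):
    """Map a string to a position in the hash space (first 8 bytes of SHA-1)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical hash ring where each server owns several virtual nodes."""

    def __init__(self, vnodes=100):
        self._vnodes = vnodes
        self._positions = []   # sorted positions of all virtual nodes
        self._owner = {}       # position -> server name

    def add_server(self, server):
        for i in range(self._vnodes):
            pos = ring_hash(f"{server}#{i}")
            if pos in self._owner:
                continue  # extremely unlikely collision: keep the existing owner
            bisect.insort(self._positions, pos)
            self._owner[pos] = server

    def remove_server(self, server):
        self._positions = [p for p in self._positions if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def get_server(self, key):
        if not self._positions:
            raise LookupError("ring is empty")
        # Walk clockwise to the first virtual node after the key, wrapping around.
        idx = bisect.bisect_right(self._positions, ring_hash(key))
        return self._owner[self._positions[idx % len(self._positions)]]


ring = HashRing()
for name in ("server-a", "server-b", "server-c"):
    ring.add_server(name)
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.get_server(k) for k in keys}
ring.remove_server("server-b")
after = {k: ring.get_server(k) for k in keys}
```

Comparing `before` and `after` shows the point of the technique: only the keys that lived on `server-b` get a new owner, while with plain `hash(key) % N` most keys would move when N changes.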