Designing Data-Intensive Applications - Chapter 6 - Partitioning

Designing Data-Intensive Applications - Chapter 6 - Partitioning

Earlier the book club of our company has studied excellent book:

Martin Kleppmann - Designing Data-Intensive Applications

This is the best book I have read about building complex scalable software systems. 💪

As usually I prepared an overview and mind-map.

Chapter 6 contains everything the DEV team should consider when designing storage for big data:

  • Partition aka Shard aka Region aka Tablet aka vNode aka vBucket. It is another approach for storing the data in addition to Replication (reviewed in the previous chapter)
  • How to partition key-value data (primary index). Problems with partitioning - skew and hotspot. Approaches: key range and hash of key.
  • Partitioning for secondary indexes: Local index and Global index
  • Rebalancing partitions as you grow. Bad and good aproaches, problems and how to deal with them. Manual vs automated rebalancing.
  • Request routing. Different aproaches, issues and solutions.

Download full mind map (PDF)

See also: