Designing Data-Intensive Applications - Chapter 10 - Batch Processing
Earlier, our company's book club studied this excellent book:
This is the best book I have read about building complex scalable software systems. 💪
As usual, I prepared an overview and a mind map.
Chapter 10 covers batch processing of big data from every angle. If your system needs to process large amounts of data, your dev team should learn this material.
- Unix tools for batch processing and the brilliant concept of pipes.
- MapReduce and Distributed File Systems: how this approach addresses the limitations of Unix pipes; Fault Tolerance and Partitioning; usage and implementations of Joins, Grouping, and Mapping; available tools and the drawbacks of the approach (see the sketch after this list).
- What lies beyond MapReduce: Dataflow engines, Graph processing, High-level APIs, and MPP databases; dealing with Fault Tolerance and Partitioning; implementations, trade-offs, and what to use when.
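
The chapter opens with a classic example: analyzing a web server log with a chain of Unix tools (awk to pull out the URL field, sort, uniq -c, and head to rank the most popular pages). Here is a minimal Python sketch of the same map / group / reduce structure that MapReduce later distributes across machines; the sample log lines and the field index are assumptions for illustration:

```python
from collections import Counter

def map_phase(log_lines):
    """Map: extract the requested URL (7th whitespace-separated field) from each log line."""
    for line in log_lines:
        fields = line.split()
        if len(fields) > 6:
            yield fields[6]  # same field the book's `awk '{print $7}'` step picks out

def reduce_phase(urls):
    """Reduce: count occurrences per URL, like `sort | uniq -c` in the Unix pipeline."""
    return Counter(urls)

if __name__ == "__main__":
    # Hypothetical access-log lines, just for illustration
    sample_log = [
        '216.58.210.78 - - [27/Feb/2015:17:55:11 +0000] "GET /index.html HTTP/1.1" 200 3377',
        '216.58.210.78 - - [27/Feb/2015:17:55:12 +0000] "GET /css/typography.css HTTP/1.1" 200 1024',
        '216.58.210.78 - - [27/Feb/2015:17:55:13 +0000] "GET /index.html HTTP/1.1" 200 3377',
    ]
    counts = reduce_phase(map_phase(sample_log))
    # Top 5 URLs, like the trailing `sort -r -n | head -n 5`
    for url, n in counts.most_common(5):
        print(n, url)
```

The Unix version handles files far bigger than memory because sorting spills to disk; MapReduce takes the same sort-based shuffle and spreads it across many machines.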
See also:
- Designing Data-Intensive Applications - Chapter 12 - The Future of Data Systems
- Designing Data-Intensive Applications - Chapter 2 - Data Models and Query Languages
- Designing Data-Intensive Applications - Chapter 11 - Stream Processing
- Designing Data-Intensive Applications - Chapter 7 - Transactions
- Designing Data-Intensive Applications - Chapter 6 - Partitioning