Designing Data-Intensive Applications

Earlier, our company's book club studied an excellent book: Martin Kleppmann's Designing Data-Intensive Applications. This is the best book I have read about building complex, scalable software systems. 💪 As usual, I prepared an overview and a mind map. Chapter 12 is a summary of the book and a visionary view of the future. Data integration: an overview of the ways we have to integrate data. Causality, and why we need total order and idempotency.

Designing Data-Intensive Applications - Chapter 11 - Stream Processing

Chapter 11 covers all aspects of stream processing. If your system needs to process data on the fly, your dev team should learn this material. Approaches for transmitting events: direct messaging, messaging systems, and partitioned logs.

Designing Data-Intensive Applications - Chapter 10 - Batch Processing

Chapter 10 covers all aspects of big data batch processing. If your system needs to process data in bulk, your dev team should learn this material. Unix tools for batch processing and the brilliant concept of pipes.

Designing Data-Intensive Applications - Chapter 9 - Consistency and Consensus

Chapter 9 is about consistency and consensus in distributed systems. It covers the following topics: what consistency and eventual consistency are; linearizability, why it is needed, and how it differs from serializability.

Designing Data-Intensive Applications - Chapter 8 - The Trouble with Distributed Systems

Chapter 8 covers the problems of distributed systems that are not specific to databases. Dev teams should consider them when designing distributed software. Faults and partial failures. The need to build a reliable system from unreliable components.

Designing Data-Intensive Applications - Chapter 7 - Transactions

Chapter 7 is all your dev team should know about transactions: the purpose of transactions; the concept of a transaction (ACID, BASE, single-object and multi-object transactions); weak isolation levels (Read Committed, Snapshot Isolation and Repeatable Read).

Designing Data-Intensive Applications - Chapter 6 - Partitioning

Chapter 6 contains everything the dev team should consider when designing storage for big data. A partition (aka shard, region, tablet, vNode, or vBucket) is another approach to storing data, in addition to replication (reviewed in the previous chapter). How to partition key-value data (primary index).
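
As a rough illustration of partitioning key-value data by the hash of the primary key, here is a minimal sketch. The function name `partition_for` and the choice of MD5 as the hash are my own assumptions for the example, not something the chapter prescribes.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition number (hypothetical helper for illustration)."""
    # Use a stable hash: Python's built-in hash() for str is randomized per process.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a partition index.
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    for key in ["user:1", "user:2", "user:3"]:
        print(key, "->", partition_for(key, 4))
```

Taking the hash modulo the number of partitions keeps the sketch short, but changing the partition count then moves most keys to a different partition, so treat this as a teaching example rather than a rebalancing-friendly scheme.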

Designing Data-Intensive Applications - Chapter 5 - Replication

Chapter 5: Intro: how to scale apps, replicating and partitioning, and three replication algorithms. Single-leader replication: leaders and followers, sync and async replication, adding new followers, handling node outages, technical implementations and all potential problems. Multi-leader replication: use cases where it is a good fit, handling write conflicts, three topologies and their potential problems. Leaderless replication: writing to the database when a node is down, quorums and their problems, detecting concurrent writes and how to resolve them.

Designing Data-Intensive Applications - Chapter 4 - Encoding and Evolution

Chapter 4: What evolvability is. Backward and forward compatibility. Approaches to encoding data: JSON, XML, and their binary variants; Thrift and Protobuf; Apache Avro. Models of data flow: through databases; through services (REST, SOAP, RPC and the future); through message brokers - when they are better and when they are not. Many more details are in the mind map.

Designing Data-Intensive Applications - Chapter 3 - Storage and Retrieval

Chapter 3: Data structures: log-structured storage. SSTables / LSM-trees (where we never update anything in place and only write to the end). A very cool idea of how to store data.
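
To make the "only write to the end" idea concrete, here is a minimal sketch of an append-only key-value log with an in-memory index of byte offsets. It is a simplified teaching example, not an SSTable or LSM-tree implementation: the class name `AppendOnlyStore`, the tab-separated record format, and the assumption that keys and values contain no tabs or newlines are all mine.

```python
import os


class AppendOnlyStore:
    """Toy log-structured store: every write is appended, nothing is updated in place."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> byte offset of the latest record for that key
        open(path, "ab").close()  # create the log file if it does not exist

    def put(self, key, value):
        # Assumption: key and value contain no tab or newline characters.
        record = f"{key}\t{value}\n".encode("utf-8")
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # the record will start at the current end
            f.write(record)
        self.index[key] = offset  # older records for this key stay in the file

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        with open(self.path, "rb") as f:
            f.seek(offset)
            line = f.readline().decode("utf-8").rstrip("\n")
        _, value = line.split("\t", 1)
        return value
```

Overwriting a key appends a new record and repoints the index, so the file keeps growing. A real log-structured engine would also compact old records and rebuild the index after a restart, which this sketch leaves out.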