Deterministic Stream Processing is a foundation for building scalable event-driven systems. The lack of determinism can be very painful for the business (imagine lost financial transactions, missed alerts, wrong data aggregation).
There are several techniques for implementing determinism. The key concepts are Timestamps, Event Scheduling, Watermarks, and Stream Time. You should also understand the nature of Out-of-Order and Late-Arriving Events, and the strategies for handling them.
Reprocessing must be supported as well.
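To make the role of timestamps, watermarks, and late-arriving events more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the way any particular stream processor implements it: the `WatermarkBuffer` class, the `allowed_lateness` parameter, and the rule "watermark = highest timestamp seen minus allowed lateness" are all hypothetical choices made for this example.

```python
import heapq
import itertools


class WatermarkBuffer:
    """Reorders events by event-time timestamp using a simple watermark.

    Assumption for this sketch: watermark = max timestamp seen - allowed_lateness.
    """

    def __init__(self, allowed_lateness):
        self.allowed_lateness = allowed_lateness
        self.max_seen = None            # highest timestamp observed so far (stream time)
        self._heap = []                 # buffered (timestamp, seq, payload) tuples
        self._seq = itertools.count()   # tie-breaker: equal timestamps keep arrival order
        self.late = []                  # events that arrived behind the watermark

    @property
    def watermark(self):
        if self.max_seen is None:
            return None
        return self.max_seen - self.allowed_lateness

    def push(self, timestamp, payload):
        """Accept one event; return the events that are now safe to emit, in order."""
        wm = self.watermark
        if wm is not None and timestamp < wm:
            # Late-arriving event: set it aside instead of emitting it out of order.
            self.late.append((timestamp, payload))
            return []
        heapq.heappush(self._heap, (timestamp, next(self._seq), payload))
        if self.max_seen is None or timestamp > self.max_seen:
            self.max_seen = timestamp
        return self._drain(self.watermark)

    def flush(self):
        """Emit everything still buffered (for example, at end of input)."""
        return self._drain(float("inf"))

    def _drain(self, up_to):
        ready = []
        while self._heap and self._heap[0][0] <= up_to:
            ts, _, payload = heapq.heappop(self._heap)
            ready.append((ts, payload))
        return ready


buffer = WatermarkBuffer(allowed_lateness=5)
emitted = []
for ts, payload in [(10, "a"), (12, "b"), (8, "c"), (20, "d"), (3, "e")]:
    emitted.extend(buffer.push(ts, payload))
emitted.extend(buffer.flush())
```

In this run the out-of-order event with timestamp 8 is still emitted in timestamp order, while the event with timestamp 3 arrives behind the watermark and ends up in `buffer.late`. What to do with late events (drop them, send them to a side channel, or update earlier results) is exactly the kind of handling strategy that has to be chosen explicitly.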
How to talk to your users when you are building a Startup? What is the best book to read on this topic? Three common mistakes everybody makes. Five great questions you can ask in every user interview. How to talk to users during three stages: idea stage, prototype stage and launched stage.
All of this is covered in the Y Combinator Startup School lesson “How to Talk to Users”.
As usual, here is my summary mind map:
An overview of the basics of event processing in Event-Driven Architectures: the typical structure of a microservice; typical types of event transformations, two branching scenarios, and merging streams; repartitioning events and when it can be useful; copartitioning events and when it is needed; assigning partitions to a consumer instance, with three strategies for doing so; and recovering from stateless processing instance failures.
These topics are covered in Chapter 5 of the book we are currently studying.
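As a rough illustration of copartitioning and partition assignment, here is a small Python sketch. The function names, the CRC32-based partitioner, and the round-robin assignment are assumptions chosen for this example; round-robin is only one simple possibility and is not necessarily one of the three strategies described in the chapter.

```python
import zlib


def partition_for(key, num_partitions):
    """Map an event key to a partition with a stable hash (CRC32 in this sketch)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_round_robin(num_partitions, consumers):
    """Spread partitions over consumer instances one by one."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Two streams are copartitioned when they use the same key, the same
# partitioner, and the same partition count, so a given key lands on the
# same partition number in both streams.
NUM_PARTITIONS = 4
orders_partition = partition_for("user-42", NUM_PARTITIONS)
payments_partition = partition_for("user-42", NUM_PARTITIONS)

assignment = assign_round_robin(NUM_PARTITIONS, ["instance-a", "instance-b"])
```

Because both streams place `"user-42"` on the same partition number, a single consumer instance that owns that partition in both streams can join them locally. If one stream used a different partition count or a different hash function, it would have to be repartitioned first.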
Data Liberation is the process of moving from a monolith towards microservices by decoupling systems in terms of their data dependencies.
There are three patterns for Data Liberation: Query-based, Log-based, and Table-based. Each pattern has its own pros and cons, as well as other important considerations.
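To give a feel for the Query-based pattern, here is a minimal sketch in Python using the standard `sqlite3` module. The `customers` table, its `updated_at` column, and the `poll_changes` helper are hypothetical and exist only for this example; a real setup would poll the monolith's database and publish the resulting events to an event broker.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in for the monolith's database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customers (id, name, updated_at) VALUES (?, ?, ?)",
        [(1, "Alice", 10), (2, "Bob", 20)],
    )
    return conn


def poll_changes(conn, last_seen):
    """Return rows changed after last_seen as events, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at, id",
        (last_seen,),
    ).fetchall()
    events = [{"id": r[0], "name": r[1], "updated_at": r[2]} for r in rows]
    new_last_seen = events[-1]["updated_at"] if events else last_seen
    return events, new_last_seen


conn = make_demo_db()
initial_events, last_seen = poll_changes(conn, 0)   # first poll loads both rows
```

Each subsequent call to `poll_changes(conn, last_seen)` picks up only the rows modified since the previous poll. This simple form has known gaps: it cannot see deleted rows, and the strict `>` comparison can miss a row committed later with the same `updated_at` value as the last one already read.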
Data definition changes (data structure migrations) must also be supported by the chosen Data Liberation approach.
There are different Liberation frameworks/tools that simplify the process of Data Liberation.