CQRS/ES: HOW TO ACHIEVE A GOOD EVENT GRANULARITY?
Dec 18, 2024
If you’ve already developed software using the event sourcing pattern, you’ve probably faced this difficulty: how do you design good events? What is a good event granularity?
Indeed it’s difficult to produce good events that will not harm our design. As a seasoned developer with event sourcing, I’m still struggling with this, even if I’ve developed several heuristics over time.
In this blog post, I will share these heuristics with you. But keep in mind that these are not best practices. Best practices are useful in contexts where we can apply a method without any (major) form of adaptation, and nothing is that simple when developing custom software for a business. The following heuristics are rather a way to ask ourselves good questions and drive our thinking.
EVENTS OWNERSHIP
I’ve mentioned event sourcing, but what I have in mind is a CQRS/ES implementation. Event sourcing is about persistence, but when combined with CQRS, events have a double responsibility:
- Events represent the decisions (and the associated information) we want to store.
- Events are a communication contract for elements inside a CQRS/ES context. Yes, CQRS/ES is also event-driven.
An event is commonly associated with an aggregate. This is true: it is the aggregate’s responsibility to emit its events. But an event belongs to a business context, because other actors will consume it to produce effects.
Tip: An event is an implementation detail of a given context. Don’t use events as a contract for cross-context communication; use dedicated messages instead.
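To make this tip concrete, here is a minimal Python sketch of a translation step between an internal event and a dedicated cross-context message. All names (InvoiceIssued, InvoiceIssuedMessageV1, to_public_message) and fields are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


# Internal event: free to change whenever the owning context evolves.
@dataclass(frozen=True)
class InvoiceIssued:
    invoice_id: str
    customer_id: str
    amount_excl_tax: Decimal
    tax_rate: Decimal
    tax_amount: Decimal


# Public message (hypothetical): a versioned contract for other contexts.
@dataclass(frozen=True)
class InvoiceIssuedMessageV1:
    invoice_id: str
    customer_id: str
    total_due: str  # serialized as text to avoid float rounding


def to_public_message(event: InvoiceIssued) -> InvoiceIssuedMessageV1:
    """Expose only what consumers need, not the internal event itself."""
    total = event.amount_excl_tax + event.tax_amount
    return InvoiceIssuedMessageV1(event.invoice_id, event.customer_id, str(total))


event = InvoiceIssued("inv-1", "cust-9", Decimal("100.00"), Decimal("0.20"), Decimal("20.00"))
message = to_public_message(event)
```

With this kind of mapping in place, the internal event can gain or lose fields without breaking consumers in other contexts, as long as the mapping keeps producing the same message.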
DEFINING EFFECTS
In a CQRS/ES implementation, we are emitting events to express decisions we made. By applying these decisions, we’re producing effects. I can think of three categories of effects:
- update the state of the emitter aggregate
- update dedicated system projections (aka readmodels)
- trigger new processes (send emails, generate files, apply new commands, etc.)
Each effect has its own data requirements. Sometimes we can reuse an event for several effects; sometimes we’ll need dedicated events.
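The three categories of effects can be sketched with a single event consumed in three different ways. This is only an illustration in Python: the event, the state, the readmodel and the outbox are hypothetical names, not part of any framework.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceIssued:  # hypothetical event
    invoice_id: str
    customer_email: str
    total_due: Decimal


@dataclass
class InvoiceState:
    """Effect 1: aggregate state, keeps only what future decisions need."""
    issued: bool = False

    def apply(self, event: InvoiceIssued) -> None:
        self.issued = True


@dataclass
class RevenueReadModel:
    """Effect 2: projection, shaped for queries."""
    total_by_customer: dict = field(default_factory=dict)

    def project(self, event: InvoiceIssued) -> None:
        current = self.total_by_customer.get(event.customer_email, Decimal("0"))
        self.total_by_customer[event.customer_email] = current + event.total_due


def request_invoice_email(event: InvoiceIssued, outbox: list) -> None:
    """Effect 3: process trigger, queues an email to be sent elsewhere."""
    outbox.append((event.customer_email, f"Invoice {event.invoice_id}: {event.total_due} due"))


event = InvoiceIssued("inv-1", "ann@example.com", Decimal("120.00"))
state, readmodel, outbox = InvoiceState(), RevenueReadModel(), []
state.apply(event)
readmodel.project(event)
request_invoice_email(event, outbox)
```

Note how each consumer reads a different subset of the event: the state needs no data at all, the projection needs the total, and the email needs everything.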
One of the first things to do is to identify the effects we want to produce.
Tip: We store information in the aggregate’s state for future decision-making. Sometimes we can replace data provided by a command with data stored in an event of the aggregate’s history. Anticipating future effects (and the associated information) can greatly simplify our software.
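As an illustration of this tip, here is a small Python sketch where a command does not carry an email address because the aggregate already knows it from its own history. The Customer aggregate, the events and the command are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRegistered:
    customer_id: str
    email: str


@dataclass(frozen=True)
class ReminderRequested:
    customer_id: str
    email: str


@dataclass(frozen=True)
class SendReminder:  # command: carries no email
    customer_id: str


class Customer:
    def __init__(self, history):
        self.email = None
        for event in history:
            self.apply(event)

    def apply(self, event) -> None:
        # Keep only what later decisions will need.
        if isinstance(event, CustomerRegistered):
            self.email = event.email

    def handle(self, command: SendReminder) -> ReminderRequested:
        # The email comes from the aggregate's history, not from the command.
        return ReminderRequested(command.customer_id, self.email)


customer = Customer([CustomerRegistered("c-1", "ann@example.com")])
emitted = customer.handle(SendReminder("c-1"))
```

The emitted event still carries the email, so whoever consumes it does not have to look it up again.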
AUTONOMOUS EVENTS
Good events are autonomous events. This means that, ideally, they carry all the data needed to apply an effect. In other words, when applying an event we’re not supposed to compute any data; we should only do some mapping and aggregation logic. There is a good reason for this. By storing events, we’re storing decisions over time, and these decisions are associated with business rules. If some data is missing from the event, applying a business rule to fill the gap is a potential issue, because we’re applying the current version of this rule, not the one that applied when the event was emitted.
Here’s an example: we’re running a business and selling a service to our customers. When issuing an invoice, we chose to only store the amount without taxes. At first glance, this looks like a harmless design decision. But when it’s time to pay taxes, we’ll need to compute how much we’ve collected from our customers. Problem: the tax rate to apply may have changed over time, maybe only for customers of a specific region, etc. This can get complicated very quickly. That’s why we want to include the rate and amount of taxes in our event.
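The invoice example could look like this in Python. It is a sketch under assumptions: the field names are mine, and the two tax rates are made-up values that only illustrate a rate change over time.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceIssued:
    invoice_id: str
    amount_excl_tax: Decimal
    tax_rate: Decimal    # the rate in force when the decision was made
    tax_amount: Decimal  # computed once, at emission time


def issue_invoice(invoice_id: str, amount_excl_tax: Decimal, current_rate: Decimal) -> InvoiceIssued:
    """Decision time: apply today's tax rule and record the result in the event."""
    tax = (amount_excl_tax * current_rate).quantize(Decimal("0.01"))
    return InvoiceIssued(invoice_id, amount_excl_tax, current_rate, tax)


def taxes_collected(history: list) -> Decimal:
    """Replay time: pure aggregation, no tax rule involved."""
    return sum((event.tax_amount for event in history), Decimal("0"))


history = [
    issue_invoice("inv-1", Decimal("100.00"), Decimal("0.196")),  # illustrative older rate
    issue_invoice("inv-2", Decimal("100.00"), Decimal("0.20")),   # illustrative newer rate
]
total_taxes = taxes_collected(history)
```

The business rule runs exactly once, in issue_invoice. Whatever happens to the rate later, taxes_collected keeps returning the same figure for the same history.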
I believe autonomous events can be achieved for my first two categories of effects (aggregate’s state and readmodels), but it’s not always possible for the third one (triggering new processes). Sometimes, the effect we’re triggering needs information from a larger scope than the one controlled by the aggregate. In these cases, we read information from dedicated readmodels.
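Here is a sketch of that last case in Python: a process triggered by an event that needs contact details the aggregate doesn’t own. The readmodel is reduced to a plain dictionary, and every name is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvoiceIssued:
    invoice_id: str
    customer_id: str
    total_due: str


# Hypothetical readmodel: contact details per customer, maintained elsewhere.
contacts_readmodel = {"cust-9": {"email": "ann@example.com", "language": "fr"}}


def on_invoice_issued(event: InvoiceIssued, contacts: dict, outbox: list) -> None:
    """The event says what happened; the readmodel supplies data outside the aggregate's scope."""
    contact = contacts[event.customer_id]
    outbox.append({
        "to": contact["email"],
        "language": contact["language"],
        "invoice_id": event.invoice_id,
        "total_due": event.total_due,
    })


outbox = []
on_invoice_issued(InvoiceIssued("inv-1", "cust-9", "120.00"), contacts_readmodel, outbox)
```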
Tip: It’s OK to repeat the same information in several events.
BUSINESS INTENTS
So far, we’ve talked about effects on the system. An effect is how the state of our system changes; it’s a side effect. But observing an effect does not tell us why it occurs. For that, we must capture intents. Indeed, there are several reasons for our system to send an email…
An effect is what happened, the intent is why it happened. Our events are driving the effects but they’re also responsible for describing the intents associated with them. The why carries a lot of value because it provides inputs to business people, it’s a good way to support future business decisions beyond the software.
Reminder: Perhaps you’ve already heard about the DRY (Don’t Repeat Yourself) principle. It’s quite often misunderstood, because this notion of repetition is not about the code, it’s about business behaviors. You can have some duplicated code, but if the copies are called for different business reasons, it is probably a good thing to keep the duplication because they may evolve differently.
So we have to ask ourselves why we want to produce an effect. For the same effect with the same intent, we want to produce the same event. For the same effect with distinct intents, we want to produce different events. Distinct events matter for future code updates: they allow us to easily modify an effect for a given intent without impacting the others.
There are two ways to encode an intent in an event: in the type or in a property. Both options have their own tradeoffs for future code updates. Choosing a property assumes that effects will evolve in a very similar way for all intents. Choosing a dedicated type results in some code duplication, but simplifies code updates when effects tend to differ over time. Personally, I tend to choose type encoding by default.
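To compare the two encodings, here is a minimal Python sketch built on the email example. The event names, the reason values and the subjects are all hypothetical.

```python
from dataclasses import dataclass


# Option 1: the intent is the event type.
@dataclass(frozen=True)
class WelcomeEmailRequested:
    recipient: str


@dataclass(frozen=True)
class PasswordResetEmailRequested:
    recipient: str


# Option 2: a single type, the intent is a property.
@dataclass(frozen=True)
class EmailRequested:
    recipient: str
    reason: str  # "welcome" or "password_reset"


def subject_by_type(event) -> str:
    """One branch per intent: each can change without touching the other."""
    if isinstance(event, WelcomeEmailRequested):
        return "Welcome aboard"
    if isinstance(event, PasswordResetEmailRequested):
        return "Reset your password"
    raise ValueError(f"unhandled event: {event!r}")


def subject_by_property(event: EmailRequested) -> str:
    """One handler shared by all intents: compact while the intents stay alike."""
    subjects = {"welcome": "Welcome aboard", "password_reset": "Reset your password"}
    return subjects[event.reason]
```

If the password reset email later needs an expiring link, the type encoding lets us add a field to PasswordResetEmailRequested alone. With the property encoding, the same change adds an optional field to every EmailRequested.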
Tip: Multiple commands can raise the same event as long as they share the same intent.
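A short Python sketch of this tip, assuming two hypothetical commands that both mean "bill the customer": one sent by a user, the other by a scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IssueInvoice:  # command sent by a user
    invoice_id: str
    issued_by: str


@dataclass(frozen=True)
class IssueScheduledInvoice:  # command sent by a scheduler
    invoice_id: str


@dataclass(frozen=True)
class InvoiceIssued:  # one event for one intent
    invoice_id: str


def decide(command) -> InvoiceIssued:
    """Both commands share the same intent, so they raise the same event."""
    if isinstance(command, (IssueInvoice, IssueScheduledInvoice)):
        return InvoiceIssued(command.invoice_id)
    raise ValueError(f"unhandled command: {command!r}")


from_user = decide(IssueInvoice("inv-1", "ann"))
from_scheduler = decide(IssueScheduledInvoice("inv-1"))
```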
SNAPSHOTS AND LIFECYCLES
One thing I remember from when I was learning about CQRS/ES: snapshots were a recurring topic.
After several years, in all the code bases I’ve worked with, I have never used snapshots and I have never encountered any use case that could justify using them. Today, I even tend to think snapshots can be considered a code smell for most use cases.
When designing an aggregate, we want to control its lifecycle, how it starts, how and when it ends. With event sourcing, this means we must limit event stream length. Using snapshots potentially means our stream is not bounded because we’ve not defined a clear end to the aggregate’s lifecycle.
To me, an event stream of 10 or 30 events looks normal depending on the aggregate’s complexity. A 150-event stream is a big one, but it doesn’t require a snapshot yet. There’s no hard limit, just be aware of the scales in your own systems.
To bound an event stream, we have to define an end we will always encounter:
BUSINESS RELATED LIMIT
Sometimes this emerges very naturally, for example an Issued event for an Invoice. Sometimes we have to define more arbitrary limits.
One of my customers was running a business with several agencies in France for the purchase and sale of valuables. We had one piece of software to track these valuables, as they needed to be moved several times for expertise before being sold again. To avoid long event streams for these objects, the solution was to end the aggregate’s lifecycle every time a valuable left a place (sold or transferred) and to initiate a new aggregate when it entered a new place (bought or transferred), with the previous valuable identity as a property.
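The idea can be sketched in Python as follows. This is not the customer’s actual code: the event names, the fields and the transfer function are hypothetical and only illustrate how one stream is closed while the next one keeps a link to it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValuableEntered:  # first event of a stream
    valuable_id: str
    place: str
    previous_valuable_id: Optional[str]  # link to the closed stream, if any


@dataclass(frozen=True)
class ValuableLeft:  # last event of a stream
    valuable_id: str
    place: str
    reason: str  # "sold" or "transferred"


def transfer(valuable_id: str, origin: str, destination: str, new_id: str):
    """Close the current aggregate and open a new one at the destination."""
    closing = ValuableLeft(valuable_id, origin, "transferred")
    opening = ValuableEntered(new_id, destination, previous_valuable_id=valuable_id)
    return closing, opening


closing, opening = transfer("val-1", "Paris", "Lyon", new_id="val-2")
```

Each stream now covers a single stay in a single place, and following previous_valuable_id from stream to stream rebuilds the full journey when someone needs it.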
TIME RELATED LIMIT
Think about how we could design a bank account with all its associated operations. A single aggregate isn’t suitable because it can last for a very long time, maybe even longer than the lifetime of its owner. In this use case, we can place time-related limits, for example a one-month lifespan. At the beginning of each month, we initiate a new aggregate with the last known balance of the bank account.
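Here is a minimal Python sketch of the monthly bank account. The stream naming scheme, the events and the helper functions are assumptions made for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class AccountMonthOpened:
    account_id: str
    month: str  # e.g. "2024-12"
    opening_balance: Decimal


@dataclass(frozen=True)
class MoneyDeposited:
    amount: Decimal


@dataclass(frozen=True)
class MoneyWithdrawn:
    amount: Decimal


def stream_id(account_id: str, month: str) -> str:
    """One stream per account and per month keeps every stream bounded."""
    return f"account-{account_id}-{month}"


def balance(events) -> Decimal:
    """Fold a single month's stream into its current balance."""
    total = Decimal("0")
    for event in events:
        if isinstance(event, AccountMonthOpened):
            total = event.opening_balance
        elif isinstance(event, MoneyDeposited):
            total += event.amount
        elif isinstance(event, MoneyWithdrawn):
            total -= event.amount
    return total


def open_next_month(account_id: str, next_month: str, previous_events) -> AccountMonthOpened:
    """Start the next aggregate from the last known balance."""
    return AccountMonthOpened(account_id, next_month, balance(previous_events))


november = [
    AccountMonthOpened("acc-1", "2024-11", Decimal("100.00")),
    MoneyDeposited(Decimal("50.00")),
    MoneyWithdrawn(Decimal("30.00")),
]
december_opening = open_next_month("acc-1", "2024-12", november)
```

The opening event plays the role a snapshot would have played, except that it is a regular business event with a clear meaning: the account starts the month with this balance.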
CONCLUSION
To summarize, events are the central building block of a CQRS/ES implementation: they’re used for data storage and for internal communication. When designing events, we need an overall view of our system to define effects and intents. I think this is the main reason why CQRS/ES is a complex pattern to use. We also have to think carefully about how long an aggregate will be used.
I hope you found these heuristics useful. It took me some time to structure my thoughts in order to formulate them.