Question

I have a general question about the CQRS paradigm.

I understand that a CommandBus and EventBus will decouple the domain model from our Query-side datastore, the merits of eventual consistency, and being able to denormalize the storage on the Query side to optimize reads, etc. That all sounds great.

But I wonder as I begin to expand the number of the components on the Query side responsible for updating the Query datastore, if they wouldn't start to contend with one another to perform their updates?

In other words, if we tried to use a pub/sub model for the EventBus, and there were a lot of different subscribers for a particular event type, couldn't they start to contend with one another over updating various bits of denormalized data? Wouldn't this put us in the same boat as we were before CQRS?

As I've heard it explained, it sounds like CQRS is supposed to do away with this contention altogether, but is this just an ideal, and in reality we're only really minimizing it? I feel like I could be missing something here, but can't put my finger on it.

Solution

It all depends on how you have designed the infrastructure. Strictly speaking, CQRS in itself doesn't say anything about how the Query models are updated. Using Events is just one of the options you have. CQRS doesn't say anything about dealing with contention either. It's just an architectural pattern that leaves you with more options and choices to deal with things like concurrency. In "regular" architectures, such as the layered architecture, you often don't have these options at all.

If you have scaled your command processing component out over multiple machines, you can assume that together they can produce more events than a single event handling component can handle. That doesn't have to be a bad thing. It may just mean that the Query models will be updated with a slightly bigger delay during peak times. If that is a problem for you, then you should consider scaling out the query models too.

The Event Handler components themselves will not be contending with each other. They can safely process events in parallel. However, if you design the system to make them all update the same data store, your data store could become the bottleneck. Setting up a cluster or dividing the query model over different data sources altogether could be a solution to your problem.
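
To illustrate the point about handlers not contending, here is a minimal in-memory sketch in Python. It is not taken from any particular CQRS framework: the event type `OrderPlaced`, the two projection classes, and the `publish_all` helper are all hypothetical names made up for this example. The assumption it demonstrates is that each handler owns its own piece of the query model, so the handlers can run in parallel without locking against one another.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical event; the fields are assumptions for this sketch."""
    order_id: str
    customer_id: str
    amount: int


class OrdersPerCustomerProjection:
    """Query model owned by exactly one handler: order count per customer."""

    def __init__(self):
        self.counts = defaultdict(int)

    def handle(self, event):
        self.counts[event.customer_id] += 1


class RevenueProjection:
    """A second query model, stored separately from the first."""

    def __init__(self):
        self.total = 0

    def handle(self, event):
        self.total += event.amount


def replay(projection, events):
    # Each projection consumes the event stream sequentially and in order.
    for event in events:
        projection.handle(event)


def publish_all(events, projections):
    # One worker per projection: the projections run in parallel with each
    # other, but no two workers ever write to the same store.
    with ThreadPoolExecutor(max_workers=len(projections)) as pool:
        futures = [pool.submit(replay, p, events) for p in projections]
        for future in futures:
            future.result()  # re-raise any handler exception


if __name__ == "__main__":
    events = [
        OrderPlaced("o1", "alice", 10),
        OrderPlaced("o2", "bob", 5),
        OrderPlaced("o3", "alice", 7),
    ]
    per_customer = OrdersPerCustomerProjection()
    revenue = RevenueProjection()
    publish_all(events, [per_customer, revenue])
    print(dict(per_customer.counts), revenue.total)
```

If both projections instead wrote to one shared table or document, the contention would move into that data store, which is the bottleneck case described above.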

Be careful not to optimize prematurely, though. Don't scale out until you have the figures to prove that it will help in your specific case. CQRS-based architectures allow you to make a lot of choices. All you need to do is make the right choice at the right time.

So far, in the applications I am involved with, I haven't come across situations where the Query model was a bottleneck. Some of these applications produce more than 100 million events per day.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow