Question

I'm aware that this question has been asked several times, but I have some concerns regarding querying from the write side that I don't see addressed in the existing questions, more specifically regarding eventual consistency in the command model.

I have a simple CQRS + ES architecture for an application. Customers can buy stuff from my site, but there is a hardcoded requirement: a customer cannot purchase more than $500 of products from our store. If they try, the purchase should not be accepted.

So, this is what my command handler looks like (in Python, simplified to leave out concerns like currencies and dependency injection):

from typing import List

class NewPurchaseCommand:
    customer_id: int
    product_ids: List[int]

class PurchasesCommandHandler:
    purchase_repository: PurchaseRepository
    product_repository: ProductRepository
    customer_query_service: CustomerQueryService

    def handle(self, cmd: NewPurchaseCommand):
        current_amount_purchased = self.customer_query_service.get_total(cmd.customer_id)

        purchase_amount = 0
        for product_id in cmd.product_ids:
            product = self.product_repository.get(product_id)
            purchase_amount += product.amount

        if current_amount_purchased + purchase_amount > 500:
            raise Exception('You cannot purchase over $500')

        new_purchase = Purchase.create(cmd.customer_id, cmd.product_ids)
        self.purchase_repository.save(new_purchase)

        # Then, after the purchase is saved, a PurchaseCreated event is persisted, 
        # sent to a queue which will then update several read projections, which one 
        # of them is the underlying table that the customer_query_service uses.

The CustomerQueryService uses an underlying table to quickly retrieve the amount the customer has purchased so far. This table is used exclusively by the write side and is updated eventually:

CustomerPurchasedAmount table
CustomerId | Amount
10         | 480

While my command handler works in simple scenarios, I want to know how to handle the edge cases that might arise:

  • User 10, who is malicious, makes two purchases of $20 at the same time. Since the CustomerPurchasedAmount table is updated eventually, both requests will succeed (this is the case I'm most concerned about).
  • A product's price might change while the request is being processed (unlikely, but then again, it can happen).

My questions are:

  • How can I protect the command from the concurrency case exposed before?
  • How should read models specifically tailored for the write side be updated? Synchronously? Asynchronously like I'm doing right now?
  • And in general, how should command validation happen if the information you are querying in order to validate might be stale?

Solution

How can I protect the command from the concurrency case exposed before?

The only way to "protect" yourself against concurrent changes is to hold a lock, which effectively means to have both of the changes be part of the same thing. Once you have decided to distribute the information, concurrency is unavoidable.
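To make "have both of the changes be part of the same thing" concrete: in an event-sourced system this usually means appending both purchases to the same stream with an expected-version check, so one of the two concurrent writers fails and must retry. Here is a minimal sketch; the `InMemoryEventStore` and `ConcurrencyError` names are mine, not an established API:

```python
# Sketch: optimistic concurrency via an expected-version check on a
# per-customer event stream. A real event store (e.g. one per aggregate)
# offers the same conditional-append primitive.

class ConcurrencyError(Exception):
    pass

class InMemoryEventStore:
    def __init__(self):
        self._streams = {}  # stream_id -> list of events

    def load(self, stream_id):
        events = self._streams.get(stream_id, [])
        return events, len(events)  # events plus the version we loaded at

    def append(self, stream_id, events, expected_version):
        current = self._streams.setdefault(stream_id, [])
        if len(current) != expected_version:
            # Someone else appended since we loaded: reject, caller retries
            # and re-runs the $500 check against the fresh history.
            raise ConcurrencyError(stream_id)
        current.extend(events)

store = InMemoryEventStore()
_, version = store.load("customer-10")
store.append("customer-10", [{"type": "PurchaseCreated", "amount": 20}], version)

# A second writer that loaded at the same version now conflicts:
try:
    store.append("customer-10", [{"type": "PurchaseCreated", "amount": 20}], version)
except ConcurrencyError:
    print("conflict detected")
```

The losing request reloads the stream, recomputes the total, and either retries or rejects the purchase, which closes the race between the two $20 purchases.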

In some cases, you can mitigate by rethinking the model so that you are working with immutable values. For instance, instead of asking for the price "now", you ask for the price at a particular time, and you take steps to ensure that for any given time there is only one price (think quotes, or sales; "offer good until 2019-12-31").
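A small sketch of that idea, under the assumption that prices are stored as time-bounded entries (the `PriceList` class and its data are illustrative):

```python
from datetime import date

# Sketch: price as an immutable, time-bounded value rather than "the price
# now". A command carries the date it was quoted at, so a later price
# change cannot alter what an in-flight purchase costs.
class PriceList:
    def __init__(self):
        # product_id -> list of (valid_from, valid_until, price) entries;
        # the store guarantees the ranges for one product never overlap.
        self._entries = {
            42: [(date(2019, 1, 1), date(2019, 12, 31), 100)],
        }

    def price_at(self, product_id, on):
        for valid_from, valid_until, price in self._entries[product_id]:
            if valid_from <= on <= valid_until:
                return price
        raise LookupError(f"no price for {product_id} on {on}")

prices = PriceList()
print(prices.price_at(42, date(2019, 6, 15)))  # 100, regardless of later changes
```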

How should read models specifically tailored for the write side be updated? Synchronously? Asynchronously like I'm doing right now?

Usually asynchronously, but largely "it depends". The "read model" being used by the write side is closer in form to a locally cached copy. This changes the line of thinking to something more like "what happens if we have a cache miss?"

Sometimes the right answer is to fail; sometimes the right answer is to accept the offered change provisionally and defer complete processing until the information is available.
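The "cache miss" framing can be sketched like this; the function and its three outcomes are hypothetical, just to show the shape of the decision:

```python
# Sketch: the write side treats its read model as a local cache. On a hit
# it validates immediately; on a miss it accepts provisionally and defers
# the limit check instead of failing the customer outright.

def handle_purchase(customer_id, amount, cached_totals, pending):
    total = cached_totals.get(customer_id)  # None models a cache miss
    if total is None:
        # Defer complete processing until the total is available.
        pending.append((customer_id, amount))
        return "pending"
    if total + amount > 500:
        return "rejected"
    return "accepted"

cached = {10: 480}
pending = []
print(handle_purchase(10, 20, cached, pending))   # accepted (480 + 20 <= 500)
print(handle_purchase(10, 30, cached, pending))   # rejected (480 + 30 > 500)
print(handle_purchase(99, 10, cached, pending))   # pending (no cached total)
```

Which branch is right for a miss is a business decision, not a technical one: for a $500 limit, provisional acceptance with a later compensating rejection is often cheaper than blocking every purchase.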

How should command validation happen if the information you are querying in order to validate might be stale?

What I've found is that you need to stop thinking of command processing as being a linear sequence of transitions along the happy path, and begin thinking instead about the processing being a state machine.

We receive an order, and therefore we need price checks on A, B, and C. The price for A is available, so we pass that in, and now we are in a state that needs prices for B and C. No other checks complete within the time limit, so we save the current work, schedule it to be resumed later, and return a response indicating that the order is in processing.
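The order-with-outstanding-price-checks scenario above can be sketched as a small state machine that is safe to persist and resume; all names here are illustrative:

```python
# Sketch: order processing as a state machine rather than a linear happy
# path. The process records which price checks are still outstanding, so
# it can be saved mid-flight and resumed when more information arrives.

class OrderProcess:
    def __init__(self, order_id, product_ids):
        self.order_id = order_id
        self.awaiting = set(product_ids)  # price checks not yet completed
        self.prices = {}

    @property
    def state(self):
        return "processing" if self.awaiting else "priced"

    def price_received(self, product_id, price):
        if product_id in self.awaiting:
            self.awaiting.remove(product_id)
            self.prices[product_id] = price

process = OrderProcess("order-1", ["A", "B", "C"])
process.price_received("A", 120)
print(process.state)   # processing: B and C have not answered yet,
                       # so we persist the process and respond "in processing"
process.price_received("B", 80)
process.price_received("C", 50)
print(process.state)   # priced: all checks complete, processing can finish
```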

If you want the advantages of distributed autonomous services, you have to let go of the concept of centralized control.

Other tips

I almost fully agree with VoiceOfUnreason's answer. I am adding a bit about how I would validate in the command handler and how I would design the aggregates.

Aggregates as transactional boundary: In domain-driven design, aggregates are the transactional boundaries. So all business invariants that need to be consistent all the time (not eventually) have to be enforced at the aggregate level.

Design of the User aggregate: You have a rule that no user should buy more than $500 worth of products. In this case I would make the orders part of my User aggregate. When I get a new purchase command, I would compute the cost of each item and update the orders entity in the User aggregate. On updating the User aggregate, I would fire an Ordered domain event. Another part of the system would handle that domain event to further process the order (your current logic).
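A minimal sketch of that aggregate, with illustrative class and event names, showing the $500 invariant enforced inside one transactional boundary:

```python
# Sketch: the User aggregate owns its orders, so the running total and the
# new order are checked and changed together, in one transaction. No stale
# read model is involved in enforcing the invariant.

class User:
    LIMIT = 500

    def __init__(self, user_id):
        self.user_id = user_id
        self.orders = []   # order amounts, part of this aggregate
        self.events = []   # uncommitted domain events

    def total(self):
        return sum(self.orders)

    def place_order(self, amount):
        if self.total() + amount > self.LIMIT:
            raise ValueError("You cannot purchase over $500")
        self.orders.append(amount)
        self.events.append(("Ordered", self.user_id, amount))

user = User(10)
user.place_order(480)
try:
    user.place_order(30)   # 480 + 30 > 500: rejected inside the boundary
except ValueError as e:
    print(e)
```

Because both concurrent $20 purchases would have to go through this single aggregate, the race from the question cannot slip past the check.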

Large aggregates lead to centralization and limit performance: In domain-driven design there is an emphasis on keeping aggregates small. All operations on an aggregate happen sequentially, one after the other, so when your aggregate is big, many operations queue up behind each other. In the above case we can only place one order after another, and if computing the price of the items in order 1 takes time, it slows down placing order 2.

Alternative design for speed: The design above is simple, and if it works within your performance limits, I think you should stick with it. The alternative design suggested here pushes the complexity of your system up. Like Amazon India, you could add another step after placing the order, such as "confirming the order". When the user orders, you accept the order immediately; after accepting it, you check the business rules on a new aggregate, say UserOrders. This way you can still enforce the $500 rule. It increases the number of aggregates in the system (complexity), and you will probably also need order confirmation and rejection emails (more complexity). In return, your users can place orders very fast (the benefit); it may take a few minutes for you to confirm the order, and nobody is blocked until you confirm or reject it.
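A sketch of that two-step flow, with hypothetical class and event names: orders are accepted up front, and a separate UserOrders aggregate later confirms or rejects each one against the $500 rule:

```python
# Sketch: accept first, confirm later. The review step runs asynchronously
# after the order was already accepted, so placing orders stays fast while
# the invariant is still enforced in one place.

class UserOrders:
    LIMIT = 500

    def __init__(self, user_id):
        self.user_id = user_id
        self.confirmed_total = 0

    def review(self, order_id, amount):
        if self.confirmed_total + amount > self.LIMIT:
            return ("OrderRejected", order_id)   # would trigger a rejection email
        self.confirmed_total += amount
        return ("OrderConfirmed", order_id)      # would trigger a confirmation email

orders = UserOrders(10)
print(orders.review("o1", 480))   # ('OrderConfirmed', 'o1')
print(orders.review("o2", 30))    # ('OrderRejected', 'o2'): 480 + 30 > 500
```

Note the trade: the user experiences instant acceptance, but the system must be able to reject an already-accepted order, which is exactly the extra complexity (emails, compensations) described above.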

Risk of enforcing business invariants with read models: The read model is updated eventually, so if you use a read model to enforce business invariants you will always face the question of whether it is stale. That said, we can use read models in command handlers for other purposes. For example, suppose you want to send an email when the packing of an order is completed: we may query the user read model for the email address, and in most cases it is fine if the email goes to a slightly old address.
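To make the distinction concrete, here is a tiny sketch of a read model used for a non-invariant purpose; the class and handler names are hypothetical:

```python
# Sketch: an eventually consistent projection is fine when no business
# invariant depends on it, such as looking up the address to notify.

class CustomerReadModel:
    def __init__(self, emails):
        self._emails = emails   # projection, possibly slightly stale

    def email_of(self, customer_id):
        return self._emails[customer_id]

def on_order_packed(customer_id, read_model, send):
    # A slightly old address is acceptable here; nothing is enforced on it.
    send(read_model.email_of(customer_id), "Your order has been packed")

sent = []
model = CustomerReadModel({10: "user10@example.com"})
on_order_packed(10, model, lambda to, body: sent.append((to, body)))
print(sent)   # [('user10@example.com', 'Your order has been packed')]
```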

Licensed under: CC-BY-SA with attribution