What should the payload of a “domain event” generated through “change data capture” include?

https://softwareengineering.stackexchange.com/questions/258580

05-10-2020

Question

Using domain driven design and event sourcing ...

Given I have a table of 3 columns: (A, B, C) with an existing row of data: (1, 2, 3), when I update the row to contain values (1000, 2, 3) and I run a tool designed to capture data changes and emit associated events, which of these should I expect as the emitted event's payload?

1. {date: 1234, newState: (1000, 2, 3)}
2. {date: 1234, prevState: (1, 2, 3), newState: (1000, 2, 3)}
3. {date: 1234, prevState: (1, 2, 3), newState: (1000, 2, 3), changed: (1,0,0)}
4. {date: 1234, newState: (1000, 2, 3), changed: (1,0,0)}
5. {date: 1234, changedData: (A: 1000)}
6. {date: 1234, changedData: (A: 1000), previousData: (A: 1)}
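
To make the comparison concrete, here is a minimal Python sketch that derives all six candidate payloads from a before/after pair of rows, numbered in the order listed above. The function name `build_payloads` and the dictionary layout are assumptions made for illustration; they are not the output format of any particular change data capture tool.

```python
from typing import Any

COLUMNS = ("A", "B", "C")  # column names taken from the example table


def build_payloads(prev: tuple, new: tuple, date: int) -> dict[int, dict[str, Any]]:
    """Return the six candidate payloads, keyed by option number (hypothetical helper)."""
    # 1 marks a column whose value differs between prev and new, 0 otherwise.
    changed = tuple(int(p != n) for p, n in zip(prev, new))
    # Only the columns that changed, keyed by column name.
    changed_data = {c: n for c, p, n in zip(COLUMNS, prev, new) if p != n}
    previous_data = {c: p for c, p, n in zip(COLUMNS, prev, new) if p != n}
    return {
        1: {"date": date, "newState": new},
        2: {"date": date, "prevState": prev, "newState": new},
        3: {"date": date, "prevState": prev, "newState": new, "changed": changed},
        4: {"date": date, "newState": new, "changed": changed},
        5: {"date": date, "changedData": changed_data},
        6: {"date": date, "changedData": changed_data, "previousData": previous_data},
    }


payloads = build_payloads(prev=(1, 2, 3), new=(1000, 2, 3), date=1234)
for number, payload in payloads.items():
    print(number, payload)
```

Options 1 to 4 need the full row on the sender side, while options 5 and 6 only need the changed columns.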

Ideally, I would like the design of the events to support a wide variety of current and future uses, including:

data replication
audit logging
event triggered activities
retroactive event insertion (possibly)

My thoughts:

Answer 1. above is the simplest, but clients that don't care about the "A" column still end up having to react to this event, as they can't tell from the event itself what's changed.
Answer 5. above is the cleanest - it just captures the "difference". The downside is that it forces the client to roll up changes to display an audit log of the full state on each change. Also event triggered activities may require knowledge of the full state to work.
Answer 4. above is perhaps more generally useful? It carries what's changed, but some contextual information alongside it.
Answer 2. above falls into a trap of asserting what the previous state was. What if that's not the previous state in your datastore, perhaps in a test environment? Do you reject the event as invalid? Would retroactively inserting an event mean changing the "prevState" field of all subsequent events?
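
The "roll up" cost mentioned for answer 5, and the retroactive-insertion concern raised for answer 2, can both be illustrated with a short Python sketch. It assumes delta events shaped like answer 5 and a known initial row; `roll_up` is a hypothetical helper, not part of any library.

```python
def roll_up(initial: dict, events: list[dict]) -> list[tuple[int, dict]]:
    """Replay delta events in date order and return the full state after each one."""
    state = dict(initial)
    history = []
    for event in sorted(events, key=lambda e: e["date"]):
        # Build a new dict each time so earlier history entries are not mutated.
        state = {**state, **event["changedData"]}
        history.append((event["date"], state))
    return history


initial = {"A": 1, "B": 2, "C": 3}
events = [
    {"date": 1234, "changedData": {"A": 1000}},
    {"date": 1300, "changedData": {"B": 20}},
]
audit = roll_up(initial, events)

# Retroactive insertion: an event dated between the two existing ones.
# The existing events are left untouched because they assert no previous state.
events_with_insert = events + [{"date": 1250, "changedData": {"C": 30}}]
audit_after_insert = roll_up(initial, events_with_insert)
```

With delta-only events the audit view has to be computed, but inserting an earlier event does not require rewriting any later one. With a `prevState` field in each event, the same insertion would leave later events asserting a previous state that is no longer true.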

Solution

You have two competing demands, and I think you need to choose which of them you want to win.

Demand 1: to have a fixed record of the state in each event, which is useful for quick auditing by scanning the raw messages.

Demand 2: to have a record of the changes applied in each event, to facilitate event sourcing.

Those two demands seem only subtly different, so why not do both? Within that subtlety, however, you'll find that they are utterly conflicting. Demand 1 specifies that the knowledge of, and hence control of, the mutation of state lies with the sender, as it has absolute knowledge of the state (either before or after the change). Demand 2 specifies that the knowledge of, and control of, state mutation lies with the receiver.

Demand 2 is suitable for event sourcing (recording that a change is requested). Demand 1 is not; it is, as you say, more of an audit log (recording that a change has occurred).

So, for that reason, since you want to do event sourcing, the only possible workable format is 5.

Part of this is likely caused by the fact that you are coming at event sourcing by listening to mutation occurring in a database, whereas it should be the other way round, events causing the mutation to occur.
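
The "other way round" ordering can be sketched as follows. This is only an illustration under stated assumptions: `EventSourcedRow`, `record`, and `replay` are hypothetical names, and the event shape reuses format 5 from the question.

```python
from dataclasses import dataclass, field


@dataclass
class EventSourcedRow:
    """Hypothetical row whose event log, not its stored state, is the source of truth."""
    state: dict
    log: list = field(default_factory=list)

    def record(self, date: int, changed_data: dict) -> dict:
        event = {"date": date, "changedData": changed_data}
        self.log.append(event)  # 1. the event is recorded first
        self._apply(event)      # 2. the mutation follows from the event
        return event

    def _apply(self, event: dict) -> None:
        self.state.update(event["changedData"])

    @classmethod
    def replay(cls, initial: dict, log: list) -> "EventSourcedRow":
        """Rebuild the row purely from the initial state and the event log."""
        row = cls(state=dict(initial))
        for event in log:
            row._apply(event)
        row.log = list(log)
        return row


row = EventSourcedRow(state={"A": 1, "B": 2, "C": 3})
row.record(date=1234, changed_data={"A": 1000})
rebuilt = EventSourcedRow.replay({"A": 1, "B": 2, "C": 3}, row.log)
```

Here the database row is a projection that can be rebuilt from the log at any time, whereas change data capture derives the events after the row has already been mutated.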

That's not to say that you can't get your scheme to work, and satisfy your requirements well, just don't think of it as event sourcing, as that's not what you actually want.

Adopting an event-sourced data setup would support your needs above somewhat more easily, but adapting existing systems to it can be very involved.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange