Question

I am using Kafka. I am developing a simple e-commerce solution. I have a non-scalable catalog admin portal where products, categories, attributes, variants of products, channels, etc. are updated. For each update, an event is fired and sent to Kafka.

There can be multiple consumers deployed on different machines and they can scale up or down as per load. The consumers consume and process the events and save changes in a scalable and efficient database.
Order of events is important for me. For example, I get a product-create event: a product P is created and lies in category C. It is important that the event for the creation of category C is processed before the product-create event for product P. Now if there are two consumers, and one picks up the product-create event for product P while the other picks up the event for the creation of category C, the product-create event may be processed first, which will lead to data inconsistency.
There can be multiple such dependencies. How do I ensure the ordered processing or some alternative to ensure data consistency?

Two solutions that are right now in my mind:

  1. We can re-queue an event until the event it depends on has been successfully processed.
  2. We can wait for the dependent event to be processed and retry processing the event at some interval, say 1 second, with some maximum number of retries.

Requeuing has the issue that the event may by then be stale and no longer required. Example:

  • Initial Order = Create-Event (dependent on event X), Event X, Delete-Event.
  • After Requeuing, Order = Event X, Delete-Event, Create-Event (dependent on event X).
    The create event is processed after the delete event, again leading to inconsistent data.

The same issue is applicable to the second solution (waiting and retrying).

The above issues can be solved by maintaining versions for events and ignoring an event if the targeted object (the one the event is going to modify) already has a higher version than that of the event.
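A minimal sketch of that version check, assuming each event carries a monotonically increasing per-entity version assigned by the producer (the field names here are hypothetical):

```python
def apply_event(store: dict, event: dict) -> bool:
    """Apply an update event only if it is newer than the stored object.

    Returns True if the event was applied, False if ignored as stale.
    """
    entity_id = event["entity_id"]
    current = store.get(entity_id)
    # Ignore stale events and duplicates (same version delivered twice).
    if current is not None and current["version"] >= event["version"]:
        return False
    store[entity_id] = {"version": event["version"], "data": event["data"]}
    return True

store = {}
apply_event(store, {"entity_id": "P", "version": 2, "data": "updated"})
# An older event arriving late is ignored:
applied = apply_event(store, {"entity_id": "P", "version": 1, "data": "old"})
```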
But I am very unsure of the pitfalls and challenges of these solutions, which might not be obvious right now.

PS: Stale data works for me but there should be no inconsistencies.


Solution

I believe Kafka has the notion of "partitions". If you put messages in the same partition, they will be processed in order by a single consumer.

"By having a notion of parallelism—the partition—within the topics, Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes. This is achieved by assigning the partitions in the topic to the consumers in the consumer group so that each partition is consumed by exactly one consumer in the group. By doing this we ensure that the consumer is the only reader of that partition and consumes the data in order"

Also, this post has some interesting detail on the topic

https://stackoverflow.com/questions/38024514/understanding-kafka-topics-and-partitions
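The partition idea can be sketched as follows. This is an illustrative stand-in, not Kafka's actual partitioner (the Java client hashes message keys with murmur2), but it shows the property that matters: keying dependent events by the same ID, here a hypothetical category ID, routes them to the same partition.

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in for Kafka's default key-based partitioner; any
    # deterministic hash gives the same property: equal keys
    # always map to the same partition.
    digest = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return digest % num_partitions

# Key every event by the entity the dependency chain hangs off -- here
# the category ID -- so the category-create and product-create events
# land in the same partition and are consumed in order by one consumer.
events = [
    {"type": "category-create", "key": "category-C"},
    {"type": "product-create",  "key": "category-C"},  # product P in C
]
partitions = {e["type"]: partition_for(e["key"], 8) for e in events}
```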

HOWEVER!!

This won't help you at all if the events themselves are created in the wrong order. Which, unless you are checking whether the category exists before someone clicks the create-product button, is going to be problematic.

If your create-category event is slow enough to need to be asynchronous, then a better solution, I think, would be to batch the events in the UI before sending them to be processed.

i.e. I click create category and nothing is sent; I just start building a batch of messages. I click create product in that category, and another message is added to the batch.

When I click "Save" the whole batch is sent as a single message to be processed as one thing, rather than many individual things, with appropriate transactions and locking to preserve data consistency.
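A sketch of such UI-side batching, with hypothetical action names: the whole batch is serialized as one message, so a single consumer can process the category and its products together and in order.

```python
import json

class EventBatch:
    """Accumulates UI actions and serializes them as a single message."""

    def __init__(self):
        self.actions = []

    def add(self, action_type: str, payload: dict) -> None:
        self.actions.append({"type": action_type, "payload": payload})

    def to_message(self) -> bytes:
        # One serialized message -> exactly one consumer processes all
        # the actions together, in the order they were added.
        return json.dumps({"actions": self.actions}).encode("utf-8")

batch = EventBatch()
batch.add("category-create", {"id": "C", "name": "Shoes"})
batch.add("product-create", {"id": "P", "category": "C"})
message = batch.to_message()  # sent on "Save"
```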

Other tips

The very point of asynchronous systems such as Kafka is that you gain performance in exchange for giving up certain guarantees, in your case the guarantee of in-order processing. Forcing one event to wait for another pretty much removes the benefits Kafka offers you; if you want to go that way, you are better off using a queueing system instead of an asynchronous one right from the start.

So, the solution which lets you keep the advantages of asynchronous processing (at the cost of code complexity) is to introduce versioning for each of your events. This makes the right ordering of events pertaining to a single entity possible: receive an event, check whether its version is newer than the one already in the database; if newer, overwrite; if not, ignore it. Note that you can also receive exactly the same version more than once due to errors and retries.

For events pertaining to different entities, the solution depends on how you process them. You could save the received events to a local DB independently for each entity (with versioning), and then periodically run a process that combines the data. For example, if there is already a record for a product but not yet for the matching category, just ignore the product; during the next run, the category's record will probably be there and you will be able to process the product.
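The periodic reconciliation pass described above might look like this sketch (all names are hypothetical): a product whose category has not been staged yet is simply deferred to the next run.

```python
def reconcile(staged_products: dict, staged_categories: dict,
              catalog: dict) -> list:
    """One reconciliation pass: move staged products into the catalog,
    deferring any product whose category has not arrived yet."""
    deferred = []
    for pid, product in staged_products.items():
        if product["category_id"] in staged_categories:
            catalog[pid] = product
        else:
            deferred.append(pid)  # category not seen yet; retry next run
    return deferred

staged_categories = {"C": {"id": "C"}}
staged_products = {
    "P1": {"category_id": "C"},  # category already staged -> processed
    "P2": {"category_id": "X"},  # category missing -> deferred
}
catalog = {}
deferred = reconcile(staged_products, staged_categories, catalog)
```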

There are also solutions such as Kafka Streams.

One way or another, consider if you really need an asynchronous system such as Kafka since their performance benefits can only be reaped at the cost of added complexity and in some situations you may be better off using a simpler, synchronous system.

Licensed under: CC-BY-SA with attribution