Question

At a conceptual level, in what ways can a subscriber catch up on events it may have missed but still needs to know about later?

For example, a service is running and subscribes to events on an event bus. When events it is interested in arrive, it updates its persisted data to stay up to date. However, the service stops for some reason and is restarted some time later. During this outage it has missed some events that were published.

How can the service catch up? Should the service be interested in catching up, or is there another way it can achieve eventual consistency?

Solution

There are several possible solutions, but which one fits depends on a number of considerations: how many publishers and subscribers there are, how big the events are, how much data is involved, how frequently the data changes, and whether all the publishers and subscribers are known before the system starts running.

With those in mind, here is a range of options, each of which may or may not suit your particular problem depending on those factors:

  1. Event Sourcing where all events are persisted in a stream or other data persistence store (NoSQL database perhaps) rather than a queue; each subscriber tracks its own progress through the stream/store in its own database.

  2. Each subscriber being provided its own dedicated, persistent queue on a middleware broker to guarantee "at least once" delivery.

  3. A subscriber is able to request a copy of the latest snapshot from the publisher, and waits for that snapshot to be received before it begins processing any deltas.

  4. The publisher is aware of its subscribers and requires that every event is acknowledged, periodically retrying if it doesn't receive an acknowledgement. (This may require a mechanism by which the subscriber can tell the publisher to start/stop, as well as a publisher retry limit.)

  5. The publisher periodically sends out full snapshots to all subscribers, and the subscriber sits idle until it receives the latest snapshot.

  6. Events don't contain data or deltas, and are used purely to notify the subscribers that something has changed; each subscriber must then pull the latest data from a source.
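
To make option 1 more concrete, here is a minimal in-memory sketch in Python. It is only an illustration under stated assumptions: `EventLog`, `Subscriber`, `catch_up` and the `key`/`value` event shape are hypothetical names, and a plain list stands in for the persisted stream. In a real system the subscriber's state and its checkpoint would be written to its own database, ideally in the same transaction.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Append-only log standing in for the persisted event stream (hypothetical)."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> int:
        self.events.append(event)
        return len(self.events) - 1  # offset of the event just appended

    def read_from(self, offset: int) -> list:
        # Return (offset, event) pairs at or after the given offset.
        return list(enumerate(self.events))[offset:]


@dataclass
class Subscriber:
    """Holds its own state and checkpoint; both would live in its own database."""
    state: dict = field(default_factory=dict)
    next_offset: int = 0  # first offset not yet processed

    def catch_up(self, log: EventLog) -> int:
        processed = 0
        for offset, event in log.read_from(self.next_offset):
            self.state[event["key"]] = event["value"]
            self.next_offset = offset + 1  # advance the checkpoint with the state
            processed += 1
        return processed


log = EventLog()
sub = Subscriber()

log.append({"key": "a", "value": 1})
sub.catch_up(log)  # subscriber is up and processes the first event

# The subscriber is "down" while two more events are published.
log.append({"key": "a", "value": 2})
log.append({"key": "b", "value": 3})

# After the restart it resumes from its checkpoint and replays what it missed.
missed = sub.catch_up(log)
```

Because the log, not the subscriber, is the source of truth, the publisher does not need to know which subscribers exist or whether they are currently running.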

The simplest approach, in my opinion, is to use a middleware broker and ensure that the broker is configured to persist messages; that way neither the publisher nor the subscriber needs to care about transport or delivery, and there's no risk of the subscriber missing any events.

However, this approach is only suitable when you know the exact set of subscribers ahead of time and can create their queues on the broker before any publisher starts sending messages.
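
The limitation can be seen in a toy model of per-subscriber queues. This is a sketch in Python, not the API of any real broker: the `Broker` class, its methods and the queue names are assumptions made up for illustration, and an in-memory `deque` stands in for a persisted queue.

```python
from collections import deque


class Broker:
    """Toy broker: one queue per subscriber, messages kept until acknowledged."""

    def __init__(self):
        self.queues = {}

    def declare_queue(self, name):
        # Idempotent: declaring an existing queue keeps its pending messages.
        self.queues.setdefault(name, deque())

    def publish(self, event):
        # Fan out a copy to every queue that exists at publish time.
        for queue in self.queues.values():
            queue.append(event)

    def receive(self, name):
        # Peek at the oldest unacknowledged message, or None if the queue is empty.
        queue = self.queues[name]
        return queue[0] if queue else None

    def ack(self, name):
        # Only an acknowledgement removes the message ("at least once" delivery).
        self.queues[name].popleft()


broker = Broker()
broker.declare_queue("orders")   # known subscriber, queue created up front

broker.publish("e1")
broker.publish("e2")             # "orders" subscriber is offline here

broker.declare_queue("billing")  # subscriber that only appears later
broker.publish("e3")

orders_pending = list(broker.queues["orders"])    # everything since the start
billing_pending = list(broker.queues["billing"])  # only what came after it joined

first = broker.receive("orders")
redelivered = broker.receive("orders")  # not acked yet, so it is delivered again
broker.ack("orders")
second = broker.receive("orders")
```

The queue declared up front holds every event for the offline subscriber, while the queue created later has no way to recover what was published before it existed.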

Licensed under: CC-BY-SA with attribution