Question

When dealing with data distributed across microservices, common solutions I've read about for sharing it are:

a) using a local cache that is updated from domain events published by the service that owns the data, or

b) creating a materialized view that is updated via domain events or database triggers.

Say we have services for Orders and Customers and we introduce a new Coupons service that needs to initialize all Customers with a Coupon based on their Sales data. How does this new service get all the existing Sales and Customer data when it's first launched? Is it standard to seed the cache or materialized view with some data loading script first, before switching to event- or trigger-based updates after that?


Solution

Streams - Events and Snapshots

Most modern filesystems maintain an event log. Every time a file is altered, created, deleted, moved, and so on, a new log entry is appended with the next sequential id.

A program can track the state of the filesystem by watching this log and noting the entries added since the last one it read. By applying those updates to its own internal data representation, it can keep current with the filesystem.
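As a minimal sketch of that idea (the `Event` shape, `Projection` class, and the string payloads are all hypothetical, not from any particular library):

```python
from dataclasses import dataclass

@dataclass
class Event:
    """A hypothetical log entry: a monotonically increasing id plus a payload."""
    id: int
    payload: str

class Projection:
    """Keeps a local view current by applying only the entries added since the last read."""
    def __init__(self):
        self.last_seen = 0   # id of the last entry we applied
        self.state = []      # our own internal data representation

    def poll(self, log):
        # Apply entries newer than our last-read position; older ones are skipped,
        # so polling the same log twice does not duplicate work.
        for event in log:
            if event.id > self.last_seen:
                self.state.append(event.payload)
                self.last_seen = event.id

log = [Event(1, "created a.txt"), Event(2, "deleted b.txt")]
p = Projection()
p.poll(log)
p.poll(log)  # re-polling the same entries is a no-op
```

The key design point is that the consumer, not the log, remembers the last-read position.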

Sometimes, though, the last log entry the program read has already fallen off the log, so it cannot be sure what else has been dropped from the queue. The only way to restore state is to read a snapshot of the filesystem from a recent point in time, instead of guessing what happened, and then catch up with the latest events (hoping the log hasn't overflowed again in the meantime).

So yes, it does make sense to perform a bulk load, or a load from a complete source representation. However, I wouldn't make another process responsible for this: it is the program's own job to manage its state, and it should know how to get itself going from a cold start or a long slumber.
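The recovery logic above can be sketched as a single routine the service runs on startup. Everything here is illustrative: `recover`, the `(id, payload)` event tuples, and the `load_snapshot` callback (assumed to return both the rebuilt state and the log position it corresponds to) are hypothetical names, not a real API.

```python
def recover(state, last_seen, oldest_retained_id, load_snapshot, log):
    """Bring a service's local view up to date after a cold start or long slumber.

    state              -- the view as last persisted by this service
    last_seen          -- id of the last log entry this service applied
    oldest_retained_id -- id of the oldest entry still in the log
    load_snapshot      -- callable returning (state, position) from a full source
    log                -- iterable of (event_id, payload) tuples, oldest first
    """
    if last_seen < oldest_retained_id - 1:
        # Gap: entries we never saw have been dropped from the log, so our
        # saved state cannot be trusted. Rebuild from a snapshot instead.
        state, last_seen = load_snapshot()
    # Either way, catch up with everything after the recovered position.
    for event_id, payload in log:
        if event_id > last_seen:
            state.append(payload)
            last_seen = event_id
    return state, last_seen

# Example: our position (2) fell behind the oldest retained entry (6),
# so we discard the stale state, load a snapshot taken at position 5,
# then replay entries 6 and 7.
state, last = recover(
    state=["stale"],
    last_seen=2,
    oldest_retained_id=6,
    load_snapshot=lambda: (["a", "b"], 5),
    log=[(6, "c"), (7, "d")],
)
```

This keeps the bulk load inside the service itself, which matches the point above: no separate seeding script, just a startup path that falls back to a full load when the log alone is not enough.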

Licensed under: CC-BY-SA with attribution