Should microservices in an event sourced architecture not communicate directly with one another via REST/gRPC/etc?

https://softwareengineering.stackexchange.com/questions/401365

04-03-2021

Question

I'm trying to wrap my head around event sourced architectures.

It seems like common advice is to have small events with as little info in them as possible (as opposed to large events that carry everything, so that other services can store their own copy of the data).

Obviously reducing coupling between services is encouraged. So much so that it seems common for people to recommend not making direct calls from one service to another, and to say that making REST/gRPC/etc. calls to another service would be a bad thing. It seems people paint events as being the only communication that should happen between services.

However, it doesn't seem possible to me to have small events AND no direct communication between services. e.g. An order service might need to know the name of a product which is stored in the product service.

Should microservices in an event sourced architecture not communicate directly with one another via REST/gRPC/etc.?

Solution

When working with microservices, the biggest concern is the time it takes to process requests. That means you have to manage synchronous response times. The important thing is to understand the impact of your design decisions: when the user makes a request from the web site, they are waiting for the whole time it takes to produce a valid response.

Asynchronous Calls

Asynchronous calls allow you to provide a response to the caller as soon as possible, while allowing all the other services to catch up. The concept is known as eventual consistency. In general you are using a message queue or similar service (like Kafka) to notify other services that data has changed.

You want your messages to provide only the information required to act on it.
- You don't save anything if your other service has to call back to get the rest of the updated information
You have to decide what your guarantees are.
- Most message queue implementations will guarantee delivery of the message, even if the service is temporarily stopped
- If your microservice can't do anything with the message, you have to determine if it is OK for the service to drain the queue or to enforce back-pressure
Save the asynchronous calls for areas where you don't need to provide answers back to the user
Updating cache entries is best done asynchronously
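
As a rough illustration of the points above, here is a minimal Python sketch. It uses the standard library's bounded `queue.Queue` as an in-process stand-in for a real message queue, and the names `make_event`, `publish`, `consume_pending` and the `ProductRenamed` event type are hypothetical, invented for this example. The event carries everything the consumer needs, so no call back to the producer is required, and the bounded queue shows one possible way to enforce back-pressure.

```python
import queue


def make_event(product_id, name):
    # Hypothetical event shape: it includes the new name, so the
    # consumer never has to call back to the producer for details.
    return {"type": "ProductRenamed", "product_id": product_id, "name": name}


def publish(q, event, timeout=0.01):
    """Try to enqueue an event; return False if the bounded queue is full.

    Returning False is the back-pressure signal: the producer has to
    slow down, retry later, or reject the work.
    """
    try:
        q.put(event, timeout=timeout)
        return True
    except queue.Full:
        return False


def consume_pending(q, cache):
    """Consumer side: apply every pending event to a local cache."""
    handled = 0
    while True:
        try:
            event = q.get_nowait()
        except queue.Empty:
            return handled
        cache[event["product_id"]] = event["name"]
        handled += 1
```

With `queue.Queue(maxsize=2)`, a third `publish` call returns `False` until the consumer has caught up. A real broker would add durability and delivery guarantees that this in-process sketch does not have.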

Synchronous Calls

There is no reason why one microservice can't call another. The browser can do it, and so can your services. You just need to know what the trade-off is:

Request chaining increases the brittleness of your code (i.e. one service calls another, which calls yet another)
Request chaining also increases the time it takes to give a response to the user because...
- There is serialization/deserialization overhead with each call
- You have processing happening at each step in the path
That said, low-latency request chaining isn't necessarily evil
- API Gateways/Reverse Proxies let you load-balance calls to microservice instances
- Some endpoints let you optimize calls to others (i.e. GraphQL federation)
You should use the circuit breaker pattern to either respond with a default response or retry the request with a different instance of a microservice
- Sometimes services get stuck in a non-responsive state due to garbage collection or some other non-deterministic reason
- By placing a maximum time you are willing to wait for a response, you limit the period of time the user has to wait
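
The circuit breaker idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a production implementation: the `CircuitBreaker` class, its parameters (`max_failures`, `reset_after`, `clock`) and the fallback handling are all invented for this example. It assumes the wrapped function enforces its own maximum wait (for instance by passing a timeout to the HTTP client) and raises an exception when that time is exceeded.

```python
import time


class CircuitBreaker:
    """Hypothetical minimal circuit breaker with a default response."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # failures before the circuit opens
        self.reset_after = reset_after    # seconds to stay open
        self.clock = clock                # injectable clock, useful in tests
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast with the default, without calling func.
                return fallback
            # Cool-down elapsed: let one trial call through (half-open).
            self.opened_at = None
        try:
            # func is assumed to raise on error or on its own timeout.
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback
        self.failures = 0
        return result
```

A caller would wrap each remote call, for example `breaker.call(lambda: fetch_product(product_id), fallback=None)`, where `fetch_product` is a hypothetical function that calls the other service with a timeout. Retrying against a different instance, as mentioned above, would be a separate step layered on top of this.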

Bottom Line

You are probably going to need to judiciously use both asynchronous calls and synchronous calls in your solution. That's OK. The main advantage of notifying other services asynchronously is that you don't have to wait for everything to be updated all at once before you send a response to the user. Another advantage is that you can feed complex algorithms when data changes or is accessed without incurring the penalty of waiting for the calculation to complete.

The less time you make the user wait, the more responsive your web application will feel. Use the most appropriate means of handling a request.

OTHER TIPS

I'm trying to wrap my head around event sourced architectures.

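Note that what you are describing here is an event driven architecture. Event sourced really means something different: a service that persists its own state as a sequence of events and rebuilds its current state by replaying them.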

If we are talking about a microservice architecture, then we are expecting to be able to redeploy one microservice at a time. Autonomous microservices need to be capable of making progress even when a neighbor is being redeployed. That produces a problem: my service can't make progress if it is waiting on an answer from your service, and your service is re-deploying.

The usual answer is to work with copies of data -- my service can make progress because it cached a copy of the information that it needs from you. In other words, if we agree to exchange messages asynchronously, rather than synchronously, then we don't get cascading outages when one microservice is unavailable.
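
A minimal Python sketch of working with copies of data, using the order/product example from the question. The `OrderService` class, the event types `ProductCreated` and `ProductRenamed`, and the dictionary-shaped events are assumptions made for illustration; how the events actually arrive (queue, bus, shared store) is left out.

```python
class OrderService:
    """Hypothetical order service that keeps its own copy of product names."""

    def __init__(self):
        # Local, non-authoritative copy; the product service stays the
        # authority for this data.
        self.product_names = {}

    def on_product_event(self, event):
        # Called whenever a product event is delivered asynchronously.
        if event["type"] in ("ProductCreated", "ProductRenamed"):
            self.product_names[event["product_id"]] = event["name"]

    def describe_order(self, order):
        # No call to the product service here, so this keeps working
        # while the product service is being redeployed.
        name = self.product_names.get(order["product_id"], "(unknown product)")
        return f'{order["quantity"]} x {name}'
```

The trade-off is that the copy may lag behind the authoritative data for a while, which is the eventual consistency the asynchronous approach accepts.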

Events are "just" messages; you need to have some way to exchange messages between the microservices in your system. You choose the appropriate transport for the circumstances you face. For example, Atom Syndication / Atom Pub gives you a nice standard for adding messages to a collection that is well understood by many organizations; but you may not want those sorts of trade-offs when you are dealing with a closed collection of microservices owned by the same company and running in close proximity to one another. An alternative is to use a message bus. Or to have your microservices share a common message store.

However, it doesn't seem possible to me to have small events AND no direct communication between services. e.g. An order service might need to know the name of a product which is stored in the product service.

To some extent it comes from separation of responsibilities -- why does the order service need anything more than an opaque product id? And is that coupling an indication that there is a design flaw somewhere else?

Many of the data designs of yesteryear were tightly constrained on space; we wanted to have a single copy of each fact to save space. The trades we make today are different, and with storage cheap it isn't as important to have a single copy of data (we still do need to be thinking about which data are copies and which are authoritative).

So yes, if the order service must have the product name, then there must be a message somewhere that brings that information. And therefore that message will need some transport to get where it needs to go.

See also Pat Helland: Data on the Outside...

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange