I'm new to microservice architecture and I'm trying to understand how services can and should interact with each other.

The acceptance criteria for my current story require me to:

  • Email a document
  • Backup the document emailed
  • Persist related information to the backup

My client is an existing monolith project that we have developed over the last three years. The goal here is to slowly move away from this large project toward small microservice projects.

So my monolith project can use a connected service to make one (or many) API calls. Nothing has to be returned to this client, so a 'fire and forget' POST call is sufficient in this scenario.

So my iterative approach was to make the send-email service first. It worked great. My monolith client called the send-email endpoint, which existed in a microservice.

Then I started to add the backup. It already had a code smell: I was unable to send without backing up, so I was already coupling two requirements. But I went ahead anyway.

As I was about to add the persistence, I stopped myself. If backing up and sending were coupled enough to be a code smell, then sending, backing up, and persisting together wasn't acceptable.

So the way I see it, there are three services waiting to be made here: email, backup, and persist.

Which brings me to my question. How do I architect a solution to make this work?

Do I need an API gateway to accept a larger DTO, which the gateway then splits into smaller DTOs to send to the appropriate services?

How do I handle the fact that each service is dependent on the previous call? E.g. the email sends and needs to emit the result; the backup happens and needs to emit its file path, for example.

I've researched RabbitMQ and I think I understand the idea, but another thing I'm struggling with is this: all of the diagrams I see show a client pointing to a gateway that calls many API endpoints, but the implementation examples I see always have the 'message queue' services as console apps.


Solution

So the way I see it, there are three services waiting to be made here: email, backup, and persist.

Not necessarily. It is not always obvious how the larger system should be split into distinct services, so your difficulties are understandable. Don't hesitate to try multiple approaches, and see which makes more sense.

One approach would be to consider the domains which would abstract away some underlying logic and could be reusable in the future for other services. In your case, I see three domains:

  • Sending e-mails.

    This one is a good candidate for a distinct service. Perfectly reusable, its task would be just that: take some information, such as a message, its attachments, etc., and do what needs to be done in order for the e-mail message to be sent. When using this service, the caller shouldn't be concerned about the internals, i.e. how the actual e-mail is generated and sent.

  • Backing up information.

    Here too, it could be a perfectly reusable service. Its task is to make backups: not backups of one specific type of document, but backups of generic information, whether that is BLOBs, structured information, or structured information attached to a BLOB. The callers of this API shouldn't be bothered with internals such as how exactly the information is replicated and where, be it an off-site server or Amazon Glacier.

  • Orchestrating the thing.

    This is your third service, which, presumably, is the one that would be called by the customers. Alas, it's tied to the specifics of the business domain and is not reusable. Its goal is to abstract away the calls to the two other services and, more importantly, to properly handle the different problems which could arise when calling them (such as the e-mail service being unavailable: would you still back up the document, or not?)

Note that the first two services are not coupled to anything: the e-mail service doesn't need to do backups, and the backup service doesn't need to send e-mails. Maybe in the future they will, but this is a different story.
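As a sketch, the orchestrator could look like the following. The service clients here are hypothetical placeholders standing in for real HTTP calls to the email and backup services; the endpoint shapes and return values are assumptions, not a prescribed API.

```python
import uuid

def send_email(correlation_id, message):
    # Placeholder for e.g. POST /email on the e-mail service.
    return {"status": "sent", "correlation_id": correlation_id}

def backup_document(correlation_id, document):
    # Placeholder for e.g. POST /backup on the backup service.
    return {"status": "stored",
            "correlation_id": correlation_id,
            "path": "/backups/" + correlation_id}

def orchestrate(message, document):
    """Call the e-mail and backup services, handling failures explicitly."""
    correlation_id = str(uuid.uuid4())
    email_result = send_email(correlation_id, message)
    if email_result["status"] != "sent":
        # Business decision lives here: skip the backup if the e-mail failed.
        return {"ok": False, "reason": "email failed"}
    backup_result = backup_document(correlation_id, document)
    return {"ok": True, "email": email_result, "backup": backup_result}

result = orchestrate("Hello", b"file contents")
print(result["ok"])  # True
```

The point is that only the orchestrator knows the business rule "no backup without a sent e-mail"; the two reusable services stay ignorant of each other.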

The emailing service will need to generate the email, and then that email is backed up. So by that logic, the backup service needs to be called from the sending service, unless the sending service returns the email being sent each time and the orchestrator handles it?

If you actually need to backup the SMTP message being sent, then yes, the e-mail service will use the backup service for this task.

This, however, is a very unusual requirement, so make sure you understand why your product owner (PO) is requesting it. Maybe the PO just wants this for debugging purposes, in which case logging is more appropriate.

How do I handle the fact that each service [is] dependent on the previous call?

You don't.

The orchestrator creates a UUID and communicates it to the services it calls. This way, you can, for instance, correlate the logs of the e-mail service with the contents of the backup. It can call them in parallel, if this is what you want. Or it can call them consecutively, and handle the errors accordingly (for instance, if it doesn't make sense to do a backup when the e-mail wasn't sent).
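A minimal sketch of the correlation idea, with the ID carried in the call payload for simplicity. In practice it would travel in an HTTP header (commonly something like `X-Correlation-ID`) or a message property; the in-process "services" below are stand-ins, not real endpoints.

```python
import uuid

logs = []  # stand-in for the separate log streams of each service

def email_service(payload):
    logs.append(("email", payload["correlation_id"], "sent"))

def backup_service(payload):
    logs.append(("backup", payload["correlation_id"], "stored"))

def orchestrate():
    # One UUID minted up front, handed to every downstream call.
    correlation_id = str(uuid.uuid4())
    email_service({"correlation_id": correlation_id})
    backup_service({"correlation_id": correlation_id})
    return correlation_id

cid = orchestrate()
# Entries from both services carry the same ID and can be tied together.
assert all(entry[1] == cid for entry in logs)
```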

Does that mean the API (orchestrator) that my client interacts with generates a GUID and passes it along in subsequent HTTP calls?

Essentially yes. Or the client can do it as well, if the client performs multiple logical calls.

I've researched RabbitMQ and I think I understand the idea, but another thing I'm struggling with is this: all of the diagrams I see show a client pointing to a gateway that calls many API endpoints, but the implementation examples I see always have the 'message queue' services as console apps.

Those are two popular (but not the only) ways for services to interact.

  • Direct HTTP calls represent a request-response model. You, as a client, make a request to a service and expect the service to reply to you. For instance, if I query a service which converts one currency to another, I perform a request specifying the amount and the two currencies, and expect an amount (or an error message) as a response.

  • RabbitMQ represents a message queue service model. You, as a client, put a request in a queue, without blocking until you get a response. If you don't need a response, that's it; your job is finished. If you need a response, you subscribe to a given set of messages, and react when you get the message you expect.

Both models are sometimes used together, and each model is sometimes misused to do what the other one does the best: you can, technically, fire and forget with HTTP calls, and you can, technically, do RPC-style calls with RabbitMQ.
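The two models can be illustrated side by side. This sketch uses an in-memory `queue.Queue` in place of a real broker like RabbitMQ (with a real broker you would use a client library such as `pika`); the currency example mirrors the request-response description above.

```python
import queue

# Request-response: the caller blocks until it gets an answer back.
def convert_currency(amount, rate):
    # Stand-in for a currency-conversion service call.
    return amount * rate

assert convert_currency(100, 2) == 200

# Message queue (fire and forget): the caller enqueues work and moves on.
work_queue = queue.Queue()
work_queue.put({"task": "send_email", "to": "user@example.com"})
# The caller's job is now done; a consumer drains the queue whenever it runs:
message = work_queue.get()
print(message["task"])  # send_email
```

The console apps you saw in the examples play exactly the consumer role here: long-running processes that sit on the queue and react to messages as they arrive.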

Additional tips

Your main problem, if I correctly understand it, is that you're trying to implement the architecture like:

[Monolith]
   |--> [Email]
   |--> [Backup]
   |--> [Related]

when what you possibly want is:

[Monolith] --> [Email] --> [Backup] --> [Related]

by which I mean:

  1. Event: Something happens in Monolith. It tells the Email microservice to send an email. Then it goes back to its happy life.

  2. Event: Something tells the Email microservice to send an email. It sends it off.

  3. Event: Something tells the Email microservice a send result. It tells the Backup microservice to persist a wad of data.

...and so on.

Hopefully that's clear enough. You don't need any 'real' orchestration, because nothing upstream cares about anything downstream.
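The chain above can be sketched with a tiny in-process event dispatcher. The event names, payload fields, and handler registration mechanism are all illustrative assumptions; with RabbitMQ, each handler would be a consumer subscribed to a queue, publishing the next event when it finishes.

```python
handlers = {}

def on(event_type):
    """Register a handler for an event type (stand-in for a queue subscription)."""
    def register(fn):
        handlers[event_type] = fn
        return fn
    return register

def emit(event_type, payload):
    """Deliver an event (stand-in for publishing to a broker)."""
    handler = handlers.get(event_type)
    if handler:
        handler(payload)

@on("email_requested")
def email_service(payload):
    # ...send the e-mail, then announce the result downstream.
    emit("email_sent", {"message_id": "m-1", **payload})

@on("email_sent")
def backup_service(payload):
    # ...back up the sent message, then announce where it landed.
    emit("backup_done", {"path": "/backups/m-1", **payload})

results = []

@on("backup_done")
def persist_service(payload):
    # ...persist the related information about the backup.
    results.append(payload["path"])

# The monolith fires the first event and goes back to its happy life.
emit("email_requested", {"to": "user@example.com"})
print(results)  # ['/backups/m-1']
```

Each service only knows which event it consumes and which event it emits; none of them knows who listens downstream, which is exactly why no central orchestration is required in this style.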

Licensed under: CC-BY-SA with attribution