Multi tenant in a micro-service

https://softwareengineering.stackexchange.com/questions/399232

02-03-2021

Question

We are creating a multi-tenant SaaS (separate databases) with a micro-service architecture, which is causing a lot of discussion between me and my manager. We are trying to determine whether the databases need to be multi-tenant, or whether each database can belong to a service regardless of tenant (each service with the data it serves).

From what I understand about micro-services, each service is a stand-alone application down to the data layer. Multi-tenancy, however, needs some sort of router/persistence layer and a shared database to pass the tenant identifier (which can be replaced by API gateway policies). In that case the micro-service architecture becomes an SOA, as every service connects to the data layer through a single point.

Points we argued a lot:

Manageability: one database versus separate databases for each tenant. Is what is easy in one a nightmare in the other?
- Backup
- Load Balancing
In case of a database crash or data loss incident, which is better?
How can we measure the feasibility of the multi-tenant database approach for a micro-service application?
Security-wise, does one have the upper hand over the other, if we exclude tenant isolation in multi-tenancy?

P.S.: micro-services are a must because some services will have a huge load that needs to be balanced separately.

Solution

Bottom Line Up Front: You will likely have to start with a compromise.

Micro-services and multi-tenancy are hard. You have to consider the trade-offs in the cost to build, run, and maintain your solution. The answers that keep those costs down are going to conflict with what makes the system more robust and secure. The challenge is to figure out where your project needs to start, and what compromises you have to accept for the moment.

There are a few axioms to keep in mind:

Complexity and cost are directly related. The more complex something is, the more expensive it will be to build it and maintain it.
Isolated systems are generally safer, but also more complex. When two tenants' data never touch, they can't affect each other.
We are not all Facebook, meaning that most companies have to worry about cost more than isolation and the complexity that comes with it.

When you start breaking down the different topics, you are going to find that the approach that is more correct for one topic is less correct for another. For example, your first topic and your second topic have different answers.

Maintainability

One thing is easier to maintain than several things. That is even more true of your database.

With one large shared database cluster, the following are going to be easier to manage:

Backup/Restore
Load balancing a cluster

At least they will be, up to a point. The problem you may run into is that one of your application's tenants has vastly greater demands than another. If your database is a shared resource between the tenants, you will eventually run into the situation where your heaviest users degrade your service to the other tenants. That may not be something you have to worry about on day one.

Impact of Disasters

If your database goes down, you will need to restore the database server and then restore the latest backup.

All tenants served by the database server that went down are affected.
- One database for all tenants means all your customers are affected
- Separate databases for each tenant means only that tenant is affected
Some databases are designed to scale out
- Sharding spreads the data across multiple nodes in a cluster
- Replication adds redundancy to your data spread across those nodes
- These are designed to allow a single node to be lost, and replaced without any loss of data or service

It's worth looking into databases that are designed to scale out. Examples would be Apache Cassandra, MongoDB, RavenDB, etc. Most NoSQL databases are designed around this concept. The upshot is that you have one "logical" database, but multiple processing nodes allow you to expand capacity as you need. It might be a worthwhile compromise to simplify your data design while having the robustness and safety you need.
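To make the sharding and replication idea concrete, here is a minimal Python sketch. It is not how any of the databases named above place data (real systems use more elaborate schemes); the placement rule, the node names, and the function names are assumptions made up for illustration.

```python
import hashlib


def replicas_for(key: str, nodes: list[str], replication_factor: int = 2) -> list[str]:
    """Pick the nodes that store a copy of `key`.

    Hypothetical placement rule for illustration: hash the key to choose a
    starting node (sharding), then copy to the next nodes in list order
    (replication).
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    copies = min(replication_factor, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def survives_loss_of(lost_node: str, key: str, nodes: list[str],
                     replication_factor: int = 2) -> bool:
    """True if at least one copy of `key` lives on a node other than `lost_node`."""
    return any(node != lost_node
               for node in replicas_for(key, nodes, replication_factor))
```

With three nodes and a replication factor of two, every key stays readable when any single node is lost. With a replication factor of one, losing the node that holds the key loses the key, which is the single-server situation described earlier.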

Feasibility of multi-tenant database approach

That's something you'll have to evaluate. The approaches you are weighing against each other are:

One database for everything
One database per tenant
One database per micro-service
One database per micro-service per tenant (the utmost in isolation)
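One way to picture the four options is as a routing rule that maps a (service, tenant) pair to a database, plus a count of how many databases you end up operating. The Python sketch below is only an illustration: the `Isolation` enum, the naming scheme, and both function names are hypothetical.

```python
from enum import Enum


class Isolation(Enum):
    SHARED = "one database for everything"
    PER_TENANT = "one database per tenant"
    PER_SERVICE = "one database per micro-service"
    PER_SERVICE_PER_TENANT = "one database per micro-service per tenant"


def database_name(strategy: Isolation, service: str, tenant: str) -> str:
    """Name of the database a request is routed to (naming scheme is made up)."""
    if strategy is Isolation.SHARED:
        return "app"
    if strategy is Isolation.PER_TENANT:
        return f"tenant_{tenant}"
    if strategy is Isolation.PER_SERVICE:
        return f"svc_{service}"
    return f"svc_{service}__tenant_{tenant}"


def database_count(strategy: Isolation, services: int, tenants: int) -> int:
    """How many databases there are to back up, patch, and monitor."""
    return {
        Isolation.SHARED: 1,
        Isolation.PER_TENANT: tenants,
        Isolation.PER_SERVICE: services,
        Isolation.PER_SERVICE_PER_TENANT: services * tenants,
    }[strategy]
```

The count is where the complexity axiom shows up: with 5 services and 100 tenants, the most isolated option means 500 databases, against 1 for the fully shared option.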

To perform a useful analysis of alternatives, you need to define:

Key performance areas/Requirements -- know what is important for your app
Cost of the solution
T-shirt size estimates of what it would take to implement each approach

Create the chart, see how each approach hits those check marks, and then make a decision. Remember the axiom about complexity and cost being directly related? The decision you have to make right now may not be what the pundits say is the most correct thing. You have to live within budget constraints. As your application brings in more revenue, your budget will increase, which will allow you to update your system in ways you can't consider right now.
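If it helps, the chart can be reduced to a weighted score per approach. The Python sketch below assumes you rate each approach from 1 to 5 against each criterion and give each criterion a weight. The criteria, weights, and ratings shown are placeholders to be replaced with your own, not a recommendation.

```python
def weighted_scores(weights: dict[str, float],
                    ratings: dict[str, dict[str, float]]) -> dict[str, float]:
    """Weighted average rating per option.

    `weights` maps criterion -> importance; `ratings` maps
    option -> criterion -> rating. Every option must rate every criterion.
    """
    total_weight = sum(weights.values())
    return {
        option: sum(weights[c] * scores[c] for c in weights) / total_weight
        for option, scores in ratings.items()
    }


# Placeholder inputs, made up purely to show the shape of the data.
example_weights = {"cost": 3, "isolation": 2, "ease of backup": 1}
example_ratings = {
    "one database for everything": {"cost": 5, "isolation": 1, "ease of backup": 5},
    "one database per tenant": {"cost": 2, "isolation": 4, "ease of backup": 2},
}
example_scores = weighted_scores(example_weights, example_ratings)
best_option = max(example_scores, key=example_scores.get)
```

The numbers matter less than the exercise of agreeing on the criteria and their weights with your manager before arguing about the result.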

Security

Security is a complicated topic with so many facets that, again, you have to make decisions based on the real legal requirements you have in your country, or that your clients demand. Below are just a few security-related concepts:

Non-repudiation (i.e. a user cannot deny the actions they performed)
Auditing (i.e. you can reconstruct the actions a user performed to find bad actors)
Data protection (i.e. a user cannot see information they are not allowed to see)
Infrastructure security (i.e. network access, file access, etc. are properly protected)
Data encryption (i.e. a user cannot discover someone else's data by sniffing network packets)

There is even more than that. Many security aspects will be constant across your alternatives (like encryption, infrastructure security, etc.). However, data protection is stronger if your database does not hold data from multiple tenants, because a missing tenant filter then cannot expose another tenant's rows. That may not matter if the user can't access the database directly.
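In a shared database, data protection between tenants usually comes down to every query carrying the tenant identifier. A minimal Python sketch of that idea, using the standard `sqlite3` module with an in-memory database, is below. The `invoices` table, the `tenant_id` column, and the `TenantScopedInvoices` class are hypothetical names chosen for illustration.

```python
import sqlite3


def open_shared_db() -> sqlite3.Connection:
    """Create an in-memory database with one table shared by all tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, amount REAL NOT NULL)"
    )
    return conn


class TenantScopedInvoices:
    """Data access object that adds the tenant filter to every statement."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, amount: float) -> int:
        cur = self._conn.execute(
            "INSERT INTO invoices (tenant_id, amount) VALUES (?, ?)",
            (self._tenant_id, amount),
        )
        return cur.lastrowid

    def list_amounts(self) -> list[float]:
        rows = self._conn.execute(
            "SELECT amount FROM invoices WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [row[0] for row in rows]
```

The design choice is that application code never writes the `WHERE tenant_id = ?` clause itself, so it cannot forget it. With a database per tenant, the same guarantee comes from the connection rather than from the query.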

When dealing with security concerns, it's best to understand what you are actually required to handle:

Are there legal requirements you need to comply with? (The UK and several other countries have very strict user privacy laws, while other countries do not.)
Are there standards your clients demand?
Are there simple and low cost things you can do to improve security?

Even when you consider user privacy laws, the security demands of a bank or health care system are going to be much greater than those needed for a social networking app.

Summary

Your team (manager included) needs to define the following:

Requirements -- what your multi-tenant application really needs, including the security requirements
Constraints -- budget, schedule, tools (some shops will define tools that cannot be used, and others may define tools that must be used)
Key Performance Areas -- includes performance criteria, management support, etc.

Without those, you won't be able to settle on something that fits the unique demands of your application. The most correct thing is going to be a bit different for each application, because the unique requirements and constraints you have to work with influence what that actually is.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange