Abstracted References Between Entities

https://stackoverflow.com/questions/425350

06-07-2019

Question

An upcoming project of mine is considering a design that involves (what I'm calling) "abstract entity references". It's quite a departure from a more common data model design, but it may be necessary to achieve the flexibility we want. I'm wondering if other architects have experience with systems like this and where the caveats are.

The project has a requirement to control access to various entities (logically: business objects; physically: database rows) by various people. For example, we might want to create rules like:

User Alice is a member of Company Z
User Bob is the manager of Group Y, which has users Charlie, Dave, and Eve.
User Frank may enter data for [critical business object] X, and also the [critical business objects] in [critical business object group] U.
User George is not a member of Company T but may view the reports for Company T.

The idea is that we have a lot of different securable objects, roles, groups, and permissions, and we want a system to handle this. Ideally this system would require little to no coding for new situations once it's launched; it should be very flexible.

In a "traditional" data design, we might have entities/tables like this:

User
Company
User/Company Cross-Reference
UserGroup
User/UserGroup Cross-Reference
CBO ("Critical Business Object")
User/CBO Cross-Reference
CBOGroup
User/CBOGroup Cross-Reference
CBO/CBOGroup Cross-Reference
ReportAccess, which is a cross-reference between User and Company specifically for access to reports

Note the large number of cross-reference tables. This system isn't terribly flexible: any time we want to add a new means of access, we'd need to introduce a new cross-reference table, and that in turn means additional coding.

The proposed system has all of the major entities (User, Company, CBO) reference a value in a new table called Entity. (In the code we'd probably make all of these entities subclasses of an Entity superclass.) Then there are two additional tables that reference Entity:

Group, which is also an Entity "subclass".
EntityRelation, which is a relation between two entities of any type (including Group). This will probably also have some sort of "Relationship Type" field to explain/qualify the relationship.
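As a rough sketch of what this could look like (assuming SQLite through Python's sqlite3 module; every table name, column name, id, and the `related` helper here is a placeholder, not a settled design):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
CREATE TABLE Entity (
    entity_id   INTEGER PRIMARY KEY,
    entity_type TEXT NOT NULL            -- e.g. 'User', 'Company', 'CBO', 'Group'
);
CREATE TABLE EntityRelation (
    from_entity   INTEGER NOT NULL REFERENCES Entity(entity_id),
    to_entity     INTEGER NOT NULL REFERENCES Entity(entity_id),
    relation_type TEXT    NOT NULL,      -- the "Relationship Type" field
    PRIMARY KEY (from_entity, to_entity, relation_type)
);
""")

# Hypothetical ids: 1 = Alice (User), 2 = Company Z, 3 = George (User).
conn.executemany("INSERT INTO Entity VALUES (?, ?)",
                 [(1, "User"), (2, "Company"), (3, "User")])
conn.executemany("INSERT INTO EntityRelation VALUES (?, ?, ?)",
                 [(1, 2, "IsMemberOf"), (3, 2, "CanViewReportsOf")])


def related(to_entity, relation_type):
    """Return ids of entities that have the given relation to `to_entity`."""
    rows = conn.execute(
        "SELECT from_entity FROM EntityRelation "
        "WHERE to_entity = ? AND relation_type = ? ORDER BY from_entity",
        (to_entity, relation_type),
    ).fetchall()
    return [row[0] for row in rows]
```

Every rule, whether membership or report access, ends up as a row in the same EntityRelation table, distinguished only by the relation type string.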

This system, at least at first glance, looks like it would meet a lot of our requirements. We might introduce new Entities down the road, but we'd never need to add tables to handle the grouping and relationships between these entities, because Group and EntityRelation can already handle that.

I'm concerned, however, that this might not work very well in practice. The relationships between entities would become very complex and might be very hard for people (users and developers alike) to understand. They'd also be very recursive, which would make things more difficult for our SQL-dependent report-writing staff.
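To illustrate the recursion concern, here is a sketch of the kind of query a report writer might need just to answer "which groups and companies does this entity belong to, directly or indirectly?". It assumes SQLite's WITH RECURSIVE support and a simplified EntityRelation table; the ids, the 'IsMemberOf' relation type, and the `containers_of` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EntityRelation (
    from_entity   INTEGER NOT NULL,
    to_entity     INTEGER NOT NULL,
    relation_type TEXT    NOT NULL
);
-- Hypothetical ids: 10 = Charlie, 20 = Group Y, 30 = Company Z
INSERT INTO EntityRelation VALUES (10, 20, 'IsMemberOf');
INSERT INTO EntityRelation VALUES (20, 30, 'IsMemberOf');
""")

# Walk 'IsMemberOf' edges upward; UNION (not UNION ALL) drops duplicates,
# which also stops the walk if the data accidentally contains a cycle.
ALL_CONTAINERS = """
WITH RECURSIVE containers(entity_id) AS (
    SELECT to_entity FROM EntityRelation
    WHERE from_entity = ? AND relation_type = 'IsMemberOf'
    UNION
    SELECT r.to_entity FROM EntityRelation AS r
    JOIN containers AS c ON r.from_entity = c.entity_id
    WHERE r.relation_type = 'IsMemberOf'
)
SELECT entity_id FROM containers ORDER BY entity_id
"""


def containers_of(entity_id):
    """Return every entity that directly or indirectly contains `entity_id`."""
    return [row[0] for row in conn.execute(ALL_CONTAINERS, (entity_id,))]
```

Here `containers_of(10)` finds both Group Y and Company Z, but only because the query recurses; a plain join would stop at the first level.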

Does anyone have experiences with a similar system?

Solution

You're modeling a set of business rules in the real world that are themselves complex. So it's not surprising that your model is going to be complex no matter how you do it.

I would recommend that you choose a database design that describes the relationships more accurately, instead of trying to be clever. Your clever design may result in fewer tables (though not by an order of magnitude, actually), but your trade-off is a lot more application code to manage it.

For example, you already know that it's going to cause confusion for users and for report designers. Another weakness is making sure the "relationship type" column contains only meaningful strings for the entities involved in the relationship. E.g. it makes sense to say Bob IsMemberOf UserGroup4, but what does it mean if CBO CanViewReportsOf Bob? Also how do you prevent mutually exclusive conditions, such as Bob IsMemberOf Company1 and Bob IsMemberOf Company2?

You have to write application code to validate the data before inserting it, and after fetching it (because your code can never be sure another part of the code hasn't introduced a data integrity bug). You may also need to write application code to perform quality control checks on the whole database, and clean up anomalies when they occur.

Compare with a database design in which it's impossible to enter invalid relationships, because the database metadata includes constraints that prevent it. This would simplify your application code a great deal.
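A minimal sketch of that contrast, assuming SQLite through Python's sqlite3 module. The table names (Users, Company, ReportAccess), the columns, the ids, and the `try_insert` helper are hypothetical, and the "exactly one company per user" rule is an assumption made only to show the mutually exclusive case:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
CREATE TABLE Company (
    company_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE Users (
    user_id    INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    -- one column, so a user cannot belong to two companies at once
    company_id INTEGER NOT NULL REFERENCES Company(company_id)
);
CREATE TABLE ReportAccess (
    user_id    INTEGER NOT NULL REFERENCES Users(user_id),
    company_id INTEGER NOT NULL REFERENCES Company(company_id),
    PRIMARY KEY (user_id, company_id)
);
INSERT INTO Company VALUES (1, 'Company1'), (2, 'Company2');
INSERT INTO Users VALUES (100, 'Bob', 1);
""")


def try_insert(sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Bob may view Company2's reports: both ids exist, so the row is accepted.
ok_grant = try_insert("INSERT INTO ReportAccess VALUES (?, ?)", (100, 2))
# Company 99 does not exist: the foreign key rejects the row.
bad_reference = try_insert("INSERT INTO ReportAccess VALUES (?, ?)", (100, 99))
# A second membership row for Bob is rejected by the primary key.
second_company = try_insert("INSERT INTO Users VALUES (?, ?, ?)", (100, "Bob", 2))
```

The nonsensical cases never reach the tables, so the application does not have to re-validate them after every fetch. With a generic relation table, nothing in the schema distinguishes a meaningful row from a meaningless one.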

You also identify hierarchical access privileges: if Bob CanViewReportsOf Company1, should he be able to view reports of any UserGroup or CBO that is a member of that company? Or do you need to enter a separate row for every entity whose reports Bob can read? These are policy problems that will exist regardless of which design you use.

To reply to your comments:

I can certainly empathize with byzantine exception-cases and evolving requirements making it hard to design simple solutions.

I worked on systems that tried to model real-world policies that grew so complex that it seemed foolish to try to codify them in software. Ultimately, the client who hired me would have used their money more effectively to hire one or two full-time administrative assistants to track their projects using paper and pencil. New exception cases that took me weeks to implement in software would have taken minutes to describe to the AA.

Automation is harder than doing things manually. The only way automation is justified is if the information needs to be tracked faster, or with higher volume, than a human could do.

OTHER TIPS

I have a weird experience with this, which goes as follows:

Architect/programmer designs an extremely symmetrical, generic model that looks really, really neat and is very tree-ish and recursive.

When it comes to user interface design the customer or user insists that real usage is much simpler and would be satisfied with these two simple screens (user/customer draws these on a blackboard for you as you listen).

At this stage I consistently find that the solution tends to get very bloated when the underlying model supports very general use cases that no one really wants or needs. So my basic advice is to always listen very closely to the customer and stick very close to what the real requirements are. Make sure your personal desires for neat structures are not the driving force here.

And yes, I have experienced this a multitude of times. In my most recent experience, all the developers were absolutely sure that we were talking about a hierarchical tree structure, but the customer decidedly wanted a flat, list-like structure in all regards. We had to go full circle (implement tree first, then list) before we caved in.

I'm not entirely sure about the generic model you suggest, but it has all the smells that set me off talking about overly generic models. At the very least, I would be sure to model both alternatives in full detail before selecting one.

Your Entity/Relation proposal is so "meta" that it would be flexible enough to handle all the crazy permutations - heck, you're one step away from a single table with a single column that contains the path to the class which implements its logic (been there, done that) - but as you point out, administering it directly would result in crazy confusion. You'd need to put a nice pretty wrapper on it from the business object layer (re: single table inheritance?) to hide all the abstraction. But before you go through all that trouble, check out other already-established systems out there. Most of the time when I find myself falling down this rabbit hole, I end up implementing Unix file system permissions, which unsurprisingly have stood the test of time.
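For reference, the core of the Unix model is small enough to sketch in a few lines. This is a simplified illustration using the mode-bit constants from Python's stat module; the `is_allowed` function and the user and group names are hypothetical, and it ignores the superuser and ACLs:

```python
import stat

# Permission bits for each class of accessor, taken from the stat module.
_BITS = {
    "owner": {"r": stat.S_IRUSR, "w": stat.S_IWUSR, "x": stat.S_IXUSR},
    "group": {"r": stat.S_IRGRP, "w": stat.S_IWGRP, "x": stat.S_IXGRP},
    "other": {"r": stat.S_IROTH, "w": stat.S_IWOTH, "x": stat.S_IXOTH},
}


def is_allowed(mode, owner, group, user, user_groups, want):
    """Check one of 'r', 'w', 'x' against a Unix-style mode.

    Exactly one class applies: owner first, then group, then other.
    """
    if user == owner:
        accessor = "owner"
    elif group in user_groups:
        accessor = "group"
    else:
        accessor = "other"
    return bool(mode & _BITS[accessor][want])


# Hypothetical object: owned by frank, group cbo_group_u, mode rw-r-----
MODE = 0o640
frank_writes = is_allowed(MODE, "frank", "cbo_group_u", "frank", set(), "w")
george_reads = is_allowed(MODE, "frank", "cbo_group_u", "george", {"cbo_group_u"}, "r")
george_writes = is_allowed(MODE, "frank", "cbo_group_u", "george", {"cbo_group_u"}, "w")
```

One owner, one group, and three permission bits per class cover a surprising number of real cases, which is a large part of why the model has lasted.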

At a previous job we went down a similar path and ended up effectively implementing Active Directory-style permissions for data entities.

Each table that required permissions would have a foreign key to a SecurityObject table. Rows in UserPermission and GroupPermission tables indicated the type of permissions various users had for that SecurityObject row. The data in the SecurityObject table was hierarchical - each row by default inherited permissions from its parent if it had one.
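The inheritance rule can be sketched roughly like this. The dictionaries are hypothetical in-memory stand-ins for the SecurityObject and UserPermission tables, and the rule that an explicit entry overrides the parent rather than merging with it is an assumption for the example, not a description of the original system:

```python
from typing import Optional

# SecurityObject id -> parent id (None for a root). Ids are made up.
parent_of: dict[int, Optional[int]] = {1: None, 2: 1, 3: 2, 4: 1}

# (user, SecurityObject id) -> explicitly granted permissions.
user_permissions: dict[tuple[str, int], set[str]] = {
    ("alice", 1): {"read"},
    ("alice", 3): {"read", "write"},
}


def effective_permissions(user: str, object_id: int) -> set[str]:
    """Return the nearest explicit grant, walking up the parent chain."""
    current: Optional[int] = object_id
    seen: set[int] = set()  # guards against an accidental cycle in the data
    while current is not None and current not in seen:
        seen.add(current)
        explicit = user_permissions.get((user, current))
        if explicit is not None:
            return explicit
        current = parent_of.get(current)
    return set()  # no grant anywhere on the path to the root
```

With this data, object 2 has no entry of its own for alice, so it falls back to the grant on its parent, object 1. Each lookup may walk several levels, which is where the query cost described further down comes from.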

The corresponding data entity classes implemented a common interface so that the security API could work with any "securable" data without having to know exactly what it was.

UI components for controlling group and user permissions provided a common interface for managing the entity permissions.

This was used to set up permissions for an in-house, database-driven CMS that was used to build a number of sites for internal use and for external use by partner companies. It also covered other types of data, such as access to client records.

One of the main problems we had when building it was minimizing the number of database hits. The hierarchical data in the SecurityObject and Group tables made efficient querying difficult, which is a big problem when you are doing permission checks on a lot of data entities.

This led to pretty heavy caching of data, which brings its own set of problems if your applications are distributed over a number of machines.

As a result of all this, I would tend to agree with some of the other posters that you need to be very sure that this is what the business requires, or you may find you are wasting your time in a painful manner.

If I were doing this again, I would be sure to use stored procedures for checking and managing the security objects and permissions - insisting on using ORM entities most likely made my job a lot harder :]

Licensed under: CC-BY-SA with attribution
