Question

Someone suggested that I use a table like the one below in a project. I can't say exactly why, but I don't think it is a good idea.

MyTable (MyTableId PK, Type INT NOT NULL, MyForeignKey INT NOT NULL)

MyForeignKey can point to data in various tables depending on the value of Type. Of course, we cannot enforce FK integrity with such a model, but is that reason enough not to use it?

Here is an example of where this could be used. Let's say you have a Notes table in a system that stores notes about various objects: notes about Users, about Documents, etc. Normally, I would model it like this:

Notes (NoteId PK, Text VARCHAR(4000), UserId INT NULL, DocumentId INT NULL, ...)

What my colleague proposes is a table like this instead:

Notes (NoteId PK, Text VARCHAR(4000), ObjectType, ObjectId)

With the second implementation, ObjectType would tell us whether ObjectId points to a row in the Users table or the Documents table. It has the advantage that the database structure and the code need fewer modifications if we want to add another type of object.

What are the pros and cons of each solution?

Note: We will never have zillions of object types. It should remain below 10.

In fact, our real-life scenario is a bit more complex. It has to do with a permission system where Users may or may not have access to various types of objects (Documents, Notes, Events, etc.).

So for now, in my database model, I have tables for these objects and additional link tables that relate them to users (UserDocuments, UserNotes, UserEvents, etc.). Permissions are set through attributes in these link tables.

My colleague is proposing a single Permissions table instead, like this:

Permissions (PermissionId PK, UserId INT, ObjectType, ObjectId, ... other permission fields...)

Is this a good idea?

Also, can we call this EAV or Open Schema? It is not exactly like what I have read on those topics.


Solution

Well, in my honest opinion both solutions are correct, but for different purposes. As you mentioned, the number of types matters, and it is usually the only thing that drives the decision.

I assume we are talking about an OLTP system. If the number of types is relatively small and you are sure it will not change (and especially will not grow), you should choose the solution with separate columns for the different types.

If the number of types is large (in my opinion 10 is already a large number), you should choose the solution with a type column and a key column.
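
As a rough sketch of the separate-columns variant, here it is in Python using the standard sqlite3 module. SQLite, the sample rows, the CHECK constraint and the try_insert helper are my assumptions for illustration and are not part of the question; other DBMSs will differ in syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE Users     (UserId     INTEGER PRIMARY KEY, Name  TEXT NOT NULL);
CREATE TABLE Documents (DocumentId INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE Notes (
    NoteId     INTEGER PRIMARY KEY,
    Text       VARCHAR(4000),
    UserId     INTEGER NULL REFERENCES Users(UserId),
    DocumentId INTEGER NULL REFERENCES Documents(DocumentId),
    -- assumption: a note belongs to exactly one object
    CHECK ((UserId IS NOT NULL) + (DocumentId IS NOT NULL) = 1)
);
INSERT INTO Users     VALUES (1,  'alice');
INSERT INTO Documents VALUES (10, 'spec');
""")

def try_insert(params):
    """Hypothetical helper: insert one note and report whether the DBMS accepted it."""
    try:
        conn.execute(
            "INSERT INTO Notes (Text, UserId, DocumentId) VALUES (?, ?, ?)", params
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

valid    = try_insert(("note about alice", 1, None))   # existing user
dangling = try_insert(("points nowhere", 99, None))    # no such user
both     = try_insert(("two owners", 1, 10))           # violates the CHECK
print(valid, dangling, both, sep="\n")
```

The point is that the database itself refuses the dangling reference; no application code is involved.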

In general the multicolumn variant is better, because you can use the DBMS's own mechanisms to keep data integrity, while with the type/key variant you have to develop those mechanisms yourself (e.g. with triggers and procedures). Sometimes the type/key variant is still the better choice, because building and maintaining those mechanisms takes less effort than changing the table structure. If adding or removing FK columns while your system is in production is not a problem for you, you can still choose the multicolumn variant; in that case its only cost is storage.
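
To give an idea of what "developing those mechanisms on your own" can look like, here is a minimal sketch of one such trigger, again in Python with sqlite3. The trigger name, the text values 'User' and 'Document' for ObjectType, and the try_insert helper are assumptions of mine. Note that this only covers inserts into Notes; updates, and deletes from Users or Documents, would need further triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users     (UserId     INTEGER PRIMARY KEY, Name  TEXT NOT NULL);
CREATE TABLE Documents (DocumentId INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE Notes (
    NoteId     INTEGER PRIMARY KEY,
    Text       VARCHAR(4000),
    ObjectType TEXT    NOT NULL,   -- assumption: 'User' or 'Document'
    ObjectId   INTEGER NOT NULL
);

-- Hand-written replacement for the FK check the DBMS cannot do here.
CREATE TRIGGER Notes_check_target BEFORE INSERT ON Notes
BEGIN
    SELECT RAISE(ABORT, 'ObjectId does not match ObjectType')
    WHERE (NEW.ObjectType = 'User'
           AND NOT EXISTS (SELECT 1 FROM Users WHERE UserId = NEW.ObjectId))
       OR (NEW.ObjectType = 'Document'
           AND NOT EXISTS (SELECT 1 FROM Documents WHERE DocumentId = NEW.ObjectId))
       OR NEW.ObjectType NOT IN ('User', 'Document');
END;

INSERT INTO Users     VALUES (1,  'alice');
INSERT INTO Documents VALUES (10, 'spec');
""")

def try_insert(params):
    """Hypothetical helper: insert one note and report whether the trigger let it through."""
    try:
        conn.execute(
            "INSERT INTO Notes (Text, ObjectType, ObjectId) VALUES (?, ?, ?)", params
        )
        return "ok"
    except sqlite3.DatabaseError as exc:
        return f"rejected: {exc}"

valid    = try_insert(("note about alice", "User", 1))
dangling = try_insert(("points nowhere", "Document", 1))   # no document with id 1
unknown  = try_insert(("unknown type", "Event", 1))        # type the trigger does not know
print(valid, dangling, unknown, sep="\n")
```

Every new object type means editing the trigger, which is part of the maintenance effort mentioned above.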

OTHER TIPS

It's a bad idea. When you mix data of several types in the same column, and add an adjacent type column to disambiguate, you nearly always end up regretting the consequences.

If you do your joins correctly, your join conditions will be more complicated, and your joins will run slower. If you do your joins incorrectly, you'll get bugs. And forgetting to check the type column is a frequent source of bugs.
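
A small illustration of the forgotten-type-check bug, in Python with the standard sqlite3 module. The sample rows and the text values used for ObjectType are assumptions for the sake of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users     (UserId     INTEGER PRIMARY KEY, Name  TEXT);
CREATE TABLE Documents (DocumentId INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Notes (
    NoteId     INTEGER PRIMARY KEY,
    Text       TEXT,
    ObjectType TEXT,      -- assumption: 'User' or 'Document'
    ObjectId   INTEGER
);
-- A user and a document that happen to share the same id value.
INSERT INTO Users     VALUES (1, 'alice');
INSERT INTO Documents VALUES (1, 'spec');
INSERT INTO Notes VALUES (100, 'note about alice',    'User',     1);
INSERT INTO Notes VALUES (101, 'note about the spec', 'Document', 1);
""")

# Join that forgets the type column: the document note is attached to alice too.
buggy = conn.execute("""
    SELECT n.Text
    FROM Notes n
    JOIN Users u ON u.UserId = n.ObjectId
    ORDER BY n.NoteId
""").fetchall()

# Correct join: every query touching Notes has to repeat the type condition.
correct = conn.execute("""
    SELECT n.Text
    FROM Notes n
    JOIN Users u ON u.UserId = n.ObjectId AND n.ObjectType = 'User'
    ORDER BY n.NoteId
""").fetchall()

print(len(buggy), len(correct))
```

The buggy query runs without any error, which is what makes this kind of mistake easy to miss.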

It's better to put pointers to different tables in different columns. But there may be a very different design that will work even better.

Your collection of objects of multiple types seems a good fit for a design technique called "Class Table Inheritance". This technique basically uses separate tables for classes and subclasses, with shared primary keys. The shared primary keys require code in your application to propagate a primary key from a class table to a subclass table. But it's worth the effort.

A separate type column is unnecessary, because a join between a class table and a subclass table will automatically weed out rows that pertain to a different class. It's slick, easy and fast.
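
Here is a minimal sketch of how that could look for the Notes example, in Python with sqlite3. The Objects superclass table and the add_user / add_document helpers are hypothetical names I made up; the helpers stand in for the application code that propagates the shared primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
-- Class table: one row per object of any kind (hypothetical name).
CREATE TABLE Objects (ObjectId INTEGER PRIMARY KEY);

-- Subclass tables: the primary key is also a foreign key to the class table.
CREATE TABLE Users     (UserId     INTEGER PRIMARY KEY REFERENCES Objects(ObjectId), Name  TEXT);
CREATE TABLE Documents (DocumentId INTEGER PRIMARY KEY REFERENCES Objects(ObjectId), Title TEXT);

-- Notes point at the class table, so a real FK can be declared.
CREATE TABLE Notes (
    NoteId   INTEGER PRIMARY KEY,
    Text     VARCHAR(4000),
    ObjectId INTEGER NOT NULL REFERENCES Objects(ObjectId)
);
""")

def add_user(name):
    """Create the class row first, then reuse its key for the subclass row."""
    object_id = conn.execute("INSERT INTO Objects DEFAULT VALUES").lastrowid
    conn.execute("INSERT INTO Users VALUES (?, ?)", (object_id, name))
    return object_id

def add_document(title):
    object_id = conn.execute("INSERT INTO Objects DEFAULT VALUES").lastrowid
    conn.execute("INSERT INTO Documents VALUES (?, ?)", (object_id, title))
    return object_id

alice = add_user("alice")
spec = add_document("spec")
conn.execute("INSERT INTO Notes (Text, ObjectId) VALUES (?, ?)", ("note about alice", alice))
conn.execute("INSERT INTO Notes (Text, ObjectId) VALUES (?, ?)", ("note about the spec", spec))

# No type column needed: joining to Users drops the document note by itself.
user_notes = conn.execute("""
    SELECT n.Text, u.Name
    FROM Notes n
    JOIN Users u ON u.UserId = n.ObjectId
""").fetchall()

# A note pointing at a non-existent object is refused by the declared FK.
try:
    conn.execute("INSERT INTO Notes (Text, ObjectId) VALUES (?, ?)", ("points nowhere", 999))
    dangling = "ok"
except sqlite3.IntegrityError as exc:
    dangling = f"rejected: {exc}"

print(user_notes, dangling)
```

Because users and documents draw their keys from the same Objects table, an id can never belong to both, which is why the plain join is safe here.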

If you look for examples of Class Table Inheritance or relational modeling of subclasses here on SO, you'll find dozens of informative Q&As.


Edited to change "Class Table Hierarchy" to "Class Table Inheritance".

Licensed under: CC-BY-SA with attribution