ERD - How to model a relation between two entities with a third entity as “attribute”

https://stackoverflow.com/questions/4082286

28-09-2019

Question

I'm modeling an entity relationship diagram and got stuck. I'm not sure whether my considerations are wrong or an ERD can't model what I want:

I have three entities: Employee, Project and Role. There is a relation between Employee and Project: an employee is working on a project. But this employee isn't just working on this project, he/she has a field of operation that is given as a role. But isn't a relation just described by attributes? How can I make something like "An employee works on this project as ..."? Of course I could use a roleId as an attribute, as I would when designing a database, but what's the relation in an ERD?

Solution

How to model a relation ... entity ... attribute ?

Before I design the database I want to model the problem as an entity-relationship diagram (using Chen's notation). In this diagram I want to create a relation between employee and project without having a look at the keys and constraints that follow. Addendum: I just know relations between two entities that are extended by attributes, but how do I model this "three-entity-relation"?

That's completely understandable, and quite correct. Paper is cheap; objects in a database are a bit more expensive to change. Model the requirement and keep improving it until you are confident, then implement.

The problem with many sites is, there are many carpenters, who although well intentioned, see every problem as a nail, and supply DDL, not the modelling assistance requested. What is missing is context and meaning, so the end result is a hard-and-fast implementation with fixed "keys" but lacking context and meaning. Modelling allows us to model various aspects that are relevant to us, without concern about what that would look like in DDL.

Another way of saying it is, OMG has answered one question, how do I model "An employee works on this project as ...", in isolation; I am answering your entire question in context.

At the logical level, many-to-many relations are correct. Such relations with no other considerations are rendered at the physical level as Associative tables. But again, it is too early to decide that, because you are still modelling the context and meaning of the relations.

... nor is it within the realm of SO markdown notation to provide it. IME, tools like Oracle Designer generate such diagrams after you've created the entities

Nonsense. The whole idea of modelling is to develop and improve something on paper, using diagrams, long before writing a line of code, or buying a platform, or having to implement DDL. The comment is about merely reverse-engineering an existing database, after the fact, which many products provide.

Example of Modelling, Progression

Use whatever symbols are meaningful to you, to model what you need. Of course standard symbols are more universally understood. Here is an ERD for you (I have no idea how "SO markdown notation" poses a limitation on providing before-the-fact modelling advice). I have provided an example of the progression that might occur. Nothing is "right" or "wrong"; it is all bits of paper until you decide which elements are worth confirming, and then the next progression is possible.

The starting point is of course, simple many-to-many relations, that you know some things about, as per your title. Trying to model a notional three-way relation is incorrect, a modelling error: in order to resolve a love triangle you need to first identify the discrete relations between each of the parties, separately; that means all relations are two-way only.
The Project, Employee and Role Entities are clear, and we know something about them. Here I have left the major Entities undeveloped, because they are "strong", and they are not what you are focussing on.
The progression uses example attributes of a relation, you can use your own. (Our Belgian colleague has already identified the issue in words, I am merely providing it in pictures.) There is a lot that people do not do in common practice, that they should do; I am concerned about true modelling, from the top, down, in order to progress and arrive at the correct data model. Remove anything that is rubbish, and continue progressing.
I've made assumptions that the attributes of the relation justify an Entity, so I have now drawn them in. Here I have used ovals, you can use diamonds or chevrons for all I care, just use some symbol, to model what you need.
Here comes the point where we can clearly see: we do not want Project::Employee::Role, because that would allow an Employee to perform any Role; we want Employees to be selected only if they have previously been approved for that Role. So, Employee::Role is becoming "stronger".
Therefore, Employee::Role is an Entity. And the pink Thing is a child of that specific combination of Employee+Role, not of all Employee or of all Role.
Likewise, we do not want any Employees to take any possible job in any possible Project, we want them to take only approved jobs in approved Projects. So Project::Role is becoming a strong identity, and it has attributes anyway.
Therefore, Project::Role is an Entity. And that remaining oval is a child of that specific combination of Project+Role, not of all Project.
Our pink child attains Entity status, with its specific attributes. More important, its constraints are derived from previously constrained Entities, not simple ones.
Data has a natural order or hierarchy, and a diagram drawn with that in mind is a lot easier to understand. We now have the opportunity to look at the attributes. They may have seemed the same or alike or confusing; whereas now they have clear meaning, due to context and hierarchy.
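
The constraint reached at the end of this progression can be checked on paper with plain sets, without deciding anything about keys or DDL. The following is only a minimal sketch: the sample names (`alice`, `apollo`, and so on) and the function `can_assign` are hypothetical and do not come from the diagram. It shows that the child occurrence is permitted only when both constrained parents, Employee::Role and Project::Role, already contain the matching combination.

```python
# Hypothetical sample data; all names are illustrative only.
# Employee::Role: the Roles each Employee has been approved for.
employee_roles = {("alice", "analyst"), ("alice", "tester"), ("bob", "developer")}
# Project::Role: the Roles that are approved within each Project.
project_roles = {("apollo", "analyst"), ("apollo", "developer")}


def can_assign(project, employee, role):
    """Allow the child occurrence only if BOTH parent occurrences exist."""
    return (employee, role) in employee_roles and (project, role) in project_roles


print(can_assign("apollo", "alice", "analyst"))    # True
print(can_assign("apollo", "alice", "developer"))  # False: alice is not approved as developer
print(can_assign("apollo", "alice", "tester"))     # False: apollo has no tester role
```

A plain Project::Employee::Role combination would accept all three calls, which is exactly what the progression is trying to rule out.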

I have introduced the concept of Identifiers, without expanding it, I will leave that for discussion, if it is necessary. I think you can see that Identifiers are actually very, very important, and they are exposed as an ordinary part of the modelling exercise.

In general terms (your question, as opposed to my example progression), when we get to Normalisation, the three initial ovals may end up as one or two or remain as three objects; simple Associative tables with no attributes; or as true Entities with attributes... but we do not, and should not, care about that right now. And again, it is too early for DDL, or for Normalisation at this stage. We have little idea what the keys are; what attributes are associated with them; and in what relationship to them. What's more, we don't care. In terms of the example, yes, the Entities are clear and unambiguous.

Feedback please, so that you can progress.

Edit: Diagram updated, multi-page.

OTHER TIPS

EMPLOYEE

employee_id (pk)

PROJECT

project_id (pk)
project_description

ROLE

role_id (pk)
role_description

If an employee can only have one role per project:

EMPLOYEE_PROJECT_MAP

project_id (pk, fk to PROJECT)
employee_id (pk, fk to EMPLOYEE)
role_id (fk to ROLE)

If an employee can have one or more roles per project:

EMPLOYEE_PROJECT_MAP

project_id (pk, fk to PROJECT)
employee_id (pk, fk to EMPLOYEE)
role_id (pk, fk to ROLE)

The difference between the two is that the composite primary key includes role_id in the latter version. Because the composite primary key spans all three columns, only the combination of the three values must be unique, which makes the following valid:

project_id  employee_id  role_id
---------------------------------
1           1            1
1           1            2

Whereas if role_id is not included in the composite primary key, each combination of employee and project can appear only once, which means an employee could only have one role per project.
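
The difference between the two key definitions can be tried out with a small script. This is a minimal sketch that assumes SQLite (via Python's standard `sqlite3` module) as a stand-in database; the helper name `try_two_roles` is hypothetical, and the foreign keys to PROJECT, EMPLOYEE and ROLE are left out to keep it short.

```python
import sqlite3


def try_two_roles(role_in_pk):
    """Insert two roles for the same employee and project; return the stored row count."""
    pk = "project_id, employee_id, role_id" if role_in_pk else "project_id, employee_id"
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_project_map ("
        " project_id INTEGER NOT NULL,"
        " employee_id INTEGER NOT NULL,"
        " role_id INTEGER NOT NULL,"
        f" PRIMARY KEY ({pk}))"
    )
    conn.execute("INSERT INTO employee_project_map VALUES (1, 1, 1)")
    try:
        conn.execute("INSERT INTO employee_project_map VALUES (1, 1, 2)")
    except sqlite3.IntegrityError:
        pass  # second role rejected: (project_id, employee_id) alone must be unique
    count = conn.execute("SELECT COUNT(*) FROM employee_project_map").fetchone()[0]
    conn.close()
    return count


print(try_two_roles(role_in_pk=True))   # 2
print(try_two_roles(role_in_pk=False))  # 1
```

With role_id in the key both rows are stored; without it the second insert is rejected by the primary key itself, so no trigger is needed.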

A CHECK constraint wouldn't work - it only checks the row, not the entire table. While a trigger would work, why bother when you can enforce the relationship via a composite primary key or unique constraint? A trigger wouldn't be visible in an ERD, nor in statements like CREATE TABLE or DESC table_name.

"Before I design the database I want to model the problem as an entity-relationship diagram (using Chen's notation). In this diagram I want to create a relation between employee and project without having a look at the keys and constraints that follow."

If the relationship "works-on" between the two (Employee and Project) is many-to-many, AND that relationship has further attributes that describe occurrences of the relationship, then you often have no other choice but to "instantiate" the relationship, i.e. define it as an extra entity. Some tools support an ERD dialect that allows specifying additional attributes for any relationship (in a rounded box with an arrow pointing to the relationship arrow), but this is, in my opinion, not common practice.
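
As a rough illustration of what "instantiating" the relationship means, the sketch below treats each occurrence of "works-on" as an object of its own that carries the extra attributes. The class name `WorksOn` and the attributes `role` and `start_date` are assumptions made for the example, not part of the question.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class WorksOn:
    """The "works-on" relationship promoted to an entity of its own."""
    employee_id: int
    project_id: int
    # Attributes describing one occurrence of the relationship (hypothetical).
    role: str
    start_date: date


occurrences = [
    WorksOn(employee_id=1, project_id=1, role="analyst", start_date=date(2019, 1, 7)),
    WorksOn(employee_id=1, project_id=2, role="tester", start_date=date(2019, 3, 4)),
]

# Each occurrence answers "employee X works on project Y as ...".
roles_of_employee_1 = {(w.project_id, w.role) for w in occurrences if w.employee_id == 1}
print(roles_of_employee_1)
```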

Licensed under: CC-BY-SA with attribution
