Question

I have implemented the following ways of storing relational topology:

1. A general junction relation table:

Table: Relation

Columns: id parent_type parent_id parent_prop child_type child_id child_prop

Most SQL engines cannot, in general, execute joins against this table.

2. Relation-specific junction tables:

Table: Class2Student

Columns: id parent_id parent_prop child_id child_prop

SQL engines can execute joins against these tables.

3. Storing lists/string maps of related objects in a text field on both objects of a bidirectional relation:

Class: Class

Class properties: id name students

Table columns: id name students_keys

Rows: 1 "history" [{type:Basic_student,id:1},{type:Advanced_student,id:3}]

To enable joins by the SQL engine, it would be possible to write a custom module, which would be even easier if the contents of students_keys were simply [1,3], i.e. if the relation were to the explicit Student type.

The questions below are asked in the following context:

I fail to see what the point of a junction table is. For example, I fail to see that the problems which the following arguments for a junction table claim to relieve actually exist:

  • Inability to save bidirectional relations in a logically correct way (e.g. there is no data orphaning in bidirectional relations, or in any relation with a keys field, because one saves recursively and can enforce the other operations (delete, update) quite easily)
  • Inability to join effectively

I am not soliciting your personal opinions on best practices or any cult-like statements on normalization.

The explicit question(s) are the following:

  1. What are the instances where one would want to run a query against a junction table that cannot be served by querying an owning object's keys field?
  2. What are the logical implementation problems, in the context of computation provided by the SQL engine, where the junction table is preferable?
  3. The only implementation difference between a junction table and a keys field is the following:

When running a search of the following nature, you would need to match against the keys field with either a custom indexing implementation or some other reasonable implementation:

class_dao.search({students:advanced_student_3,name:"history"});

That is, search for Classes that have a particular student and the name "history".

This is as opposed to searching the indexed columns of the junction table and then selecting the appropriate Classes.
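
To make the contrast concrete, here is a minimal sketch of the two search paths using Python's built-in sqlite3 module. The SQLite dialect, the trimmed-down Class2Student columns, the second sample class, and the helper names search_keys_field and search_junction are all assumptions made for illustration; the junction variant also assumes the relation is to the explicit Student type, so it does not carry the type tag.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Class (id INTEGER PRIMARY KEY, name TEXT, students_keys TEXT);
    -- Trimmed-down relation-specific junction table (assumed layout).
    CREATE TABLE Class2Student (
        parent_id INTEGER NOT NULL,   -- Class id
        child_id  INTEGER NOT NULL,   -- Student id
        PRIMARY KEY (child_id, parent_id)
    );
""")

# Hypothetical sample rows: two classes that are both named "history".
conn.execute(
    "INSERT INTO Class VALUES (1, 'history', ?)",
    (json.dumps([{"type": "Basic_student", "id": 1},
                 {"type": "Advanced_student", "id": 3}]),),
)
conn.execute(
    "INSERT INTO Class VALUES (2, 'history', ?)",
    (json.dumps([{"type": "Basic_student", "id": 2}]),),
)
conn.executemany("INSERT INTO Class2Student VALUES (?, ?)",
                 [(1, 1), (1, 3), (2, 2)])


def search_keys_field(conn, student_type, student_id, name):
    """Keys field: the engine filters on name only; membership is checked
    by decoding every candidate row in application code."""
    rows = conn.execute(
        "SELECT id, students_keys FROM Class WHERE name = ?", (name,))
    return [
        class_id
        for class_id, keys in rows
        if any(k["type"] == student_type and k["id"] == student_id
               for k in json.loads(keys))
    ]


def search_junction(conn, student_id, name):
    """Junction table: the engine resolves membership itself, using the
    primary-key index that starts with child_id."""
    rows = conn.execute(
        """SELECT c.id
           FROM Class2Student j
           JOIN Class c ON c.id = j.parent_id
           WHERE j.child_id = ? AND c.name = ?
           ORDER BY c.id""",
        (student_id, name),
    )
    return [row[0] for row in rows]


keys_result = search_keys_field(conn, "Advanced_student", 3, "history")
junction_result = search_junction(conn, 3, "history")
```

Both calls find the same class; the difference is where the membership test runs. With the keys field it runs in application code over every row named "history", while with the junction table it is an index lookup inside the engine.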

I have been unable to identify why a junction table is logically preferable, for quite literally any reason. I am not claiming that no such reasons exist, nor do I have a religious preference one way or another, as evidenced by the fact that I implemented multiple ways of achieving this. My problem is that I do not know what the reasons are.


Solution

The way I see it, you have several entities:

CREATE TABLE StudentType
(
    Id Int PRIMARY KEY,
    Name NVarChar(50) 
);

INSERT INTO StudentType (Id, Name) VALUES
    (1, 'Basic'),
    (2, 'Advanced'),
    (3, 'SomeOtherCategory');

CREATE TABLE Student
(
    Id Int PRIMARY KEY,
    Name NVarChar(200),
    OtherAttributeCommonToAllStudents Int,
    Type Int,
    CONSTRAINT FK_Student_StudentType
        FOREIGN KEY (Type) REFERENCES StudentType(Id)
)

CREATE TABLE StudentAdvanced
(
    Id Int PRIMARY KEY,
    AdvancedOnlyAttribute Int,
    CONSTRAINT FK_StudentAdvanced_Student
        FOREIGN KEY (Id) REFERENCES Student(Id)
)

CREATE TABLE StudentSomeOtherCategory
(
    Id Int PRIMARY KEY,
    SomeOtherCategoryOnlyAttribute Int,
    CONSTRAINT FK_StudentSomeOtherCategory_Student
        FOREIGN KEY (Id) REFERENCES Student(Id)
)
  1. Any attributes that are common to all students have columns on the Student table.
  2. Types of student that have extra attributes are added to the StudentType table.
  3. Each extra student type gets a Student<TypeName> table to store its specific attributes. These tables have an optional one-to-one relationship with Student.
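
As a rough illustration of how the three points above play out, here is a small sketch using Python's built-in sqlite3 module. It assumes the SQLite dialect rather than the T-SQL shown above, drops a few columns for brevity, and uses made-up sample rows ('Ann', 'Bob'); none of that comes from the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
conn.executescript("""
    CREATE TABLE StudentType (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Student (
        Id   INTEGER PRIMARY KEY,
        Name TEXT,
        Type INTEGER REFERENCES StudentType(Id)
    );
    -- Optional one-to-one extension: the key is also a foreign key.
    CREATE TABLE StudentAdvanced (
        Id INTEGER PRIMARY KEY REFERENCES Student(Id),
        AdvancedOnlyAttribute INTEGER
    );
    INSERT INTO StudentType VALUES (1, 'Basic'), (2, 'Advanced');
    INSERT INTO Student VALUES (1, 'Ann', 1), (3, 'Bob', 2);  -- sample data
    INSERT INTO StudentAdvanced VALUES (3, 42);               -- sample data
""")

# One query returns every student; the type-specific column is NULL
# for students that have no row in the extension table.
rows = conn.execute("""
    SELECT s.Id, s.Name, t.Name, a.AdvancedOnlyAttribute
    FROM Student s
    JOIN StudentType t ON t.Id = s.Type
    LEFT JOIN StudentAdvanced a ON a.Id = s.Id
    ORDER BY s.Id
""").fetchall()

# The engine itself refuses an extension row for a student that does not exist.
try:
    conn.execute("INSERT INTO StudentAdvanced VALUES (99, 7)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The type of each student stays an ordinary column, so a relation to "a student" never has to carry a type tag alongside the id.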

I think that your "straw-man" junction table is a partial implementation of the EAV anti-pattern. The only time this is sensible is when you can't know what attributes you need to model, i.e. your data will be entirely unstructured. When this is a real requirement, relational databases start to look less desirable. On those occasions, consider a NoSQL/document database alternative.


A junction table would be useful in the following scenario.

Say we add a Class entity to the model.

CREATE TABLE Class
(
    Id Int PRIMARY KEY,
    ...
)

It's conceivable that we would like to store the many-to-many relationship between students and classes.

CREATE TABLE Registration
(
    Id Int PRIMARY KEY,
    StudentId Int,
    ClassId Int,
    CONSTRAINT FK_Registration_Student
        FOREIGN KEY (StudentId) REFERENCES Student(Id),
    CONSTRAINT FK_Registration_Class
        FOREIGN KEY (ClassId) REFERENCES Class(Id)
)

This entity would be the right place to store attributes that relate specifically to a student's registration for a class, for instance a completion flag. Other data would naturally relate to this junction too, perhaps a class-specific attendance record or a grade history.

If you don't relate Class and Student in this way, how would you select both all the students in a class and all the classes a student reads? Performance-wise, this is easily optimised by indices on the key columns.
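
A minimal sketch of both lookups, again using Python's sqlite3 module. The SQLite dialect, the Completed column (standing in for the completion flag mentioned above), the index names, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Class   (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Registration (
        Id        INTEGER PRIMARY KEY,
        StudentId INTEGER NOT NULL REFERENCES Student(Id),
        ClassId   INTEGER NOT NULL REFERENCES Class(Id),
        Completed INTEGER NOT NULL DEFAULT 0  -- attribute of the relationship
    );
    -- One index per direction of travel (hypothetical names).
    CREATE INDEX IX_Registration_Student ON Registration (StudentId, ClassId);
    CREATE INDEX IX_Registration_Class   ON Registration (ClassId, StudentId);

    -- Sample data.
    INSERT INTO Student VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Class   VALUES (10, 'history'), (20, 'maths');
    INSERT INTO Registration (Id, StudentId, ClassId, Completed) VALUES
        (1, 1, 10, 1),
        (2, 3, 10, 0),
        (3, 3, 20, 0);
""")

# All the students in a class, with the per-registration flag.
students_in_history = conn.execute("""
    SELECT s.Name, r.Completed
    FROM Registration r
    JOIN Student s ON s.Id = r.StudentId
    WHERE r.ClassId = ?
    ORDER BY s.Id
""", (10,)).fetchall()

# All the classes a student reads.
classes_of_student_3 = [row[0] for row in conn.execute("""
    SELECT c.Name
    FROM Registration r
    JOIN Class c ON c.Id = r.ClassId
    WHERE r.StudentId = ?
    ORDER BY c.Id
""", (3,))]
```

Both directions are the same shape of query against the same table; neither side "owns" the relationship, and the completion flag has an obvious home.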


When a many-to-many relationship exists without any attributes, I agree that, logically, the junction table needn't exist. However, in a relational database, junction tables are still a useful physical implementation, perhaps like this:

CREATE TABLE StudentClass
(
    StudentId Int,
    ClassId Int,
    CONSTRAINT PK_StudentClass PRIMARY KEY (ClassId, StudentId),
    CONSTRAINT FK_StudentClass_Student
        FOREIGN KEY (StudentId) REFERENCES Student(Id),
    CONSTRAINT FK_StudentClass_Class
        FOREIGN KEY (ClassId) REFERENCES Class(Id)
)

This allows simple queries such as:

-- students in a class?
SELECT StudentId
FROM StudentClass
WHERE ClassId = @classId

-- classes read by a student?
SELECT ClassId
FROM StudentClass
WHERE StudentId = @studentId

Additionally, this enables a simple way to manage the relationship, partially or completely, from either side, in a form that will be familiar to relational database developers and sargable by query optimisers.
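
To sketch what managing the relationship from either side can look like, here is a small example with Python's sqlite3 module. The SQLite dialect, the extra index on StudentId, the helper is_rejected, and the sample ids are assumptions for illustration, not part of the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
conn.executescript("""
    CREATE TABLE Student (Id INTEGER PRIMARY KEY);
    CREATE TABLE Class   (Id INTEGER PRIMARY KEY);
    CREATE TABLE StudentClass (
        StudentId INTEGER NOT NULL REFERENCES Student(Id),
        ClassId   INTEGER NOT NULL REFERENCES Class(Id),
        PRIMARY KEY (ClassId, StudentId)
    );
    -- The primary key serves lookups by class; this assumed extra index
    -- serves lookups by student.
    CREATE INDEX IX_StudentClass_Student ON StudentClass (StudentId);

    -- Sample data.
    INSERT INTO Student VALUES (1), (3);
    INSERT INTO Class   VALUES (10), (20);
    INSERT INTO StudentClass (StudentId, ClassId) VALUES (1, 10), (3, 10), (3, 20);
""")


def is_rejected(sql):
    """Return True if the engine refuses the statement (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# The same link cannot be stored twice, and a link to a missing student
# cannot be stored at all; the engine checks both.
duplicate_rejected = is_rejected(
    "INSERT INTO StudentClass (StudentId, ClassId) VALUES (3, 10)")
dangling_rejected = is_rejected(
    "INSERT INTO StudentClass (StudentId, ClassId) VALUES (99, 10)")

# Managing the relationship from the student side: student 3 leaves every
# class in one statement, without touching any Class row.
with conn:
    conn.execute("DELETE FROM StudentClass WHERE StudentId = 3")

remaining = conn.execute(
    "SELECT StudentId, ClassId FROM StudentClass ORDER BY StudentId, ClassId"
).fetchall()
```

With a keys field on both objects, the same removal would mean rewriting the text field on the student and on every affected class, and keeping the two copies in agreement would be the application's job.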

Licensed under: CC-BY-SA with attribution