Question

I am looking for a way to optimize my database model. The database manages surveys: you can add questions and answer them. An entry is one response to a survey and holds an answer for each question.

Here is the simplified schema: [image not included]

Is there a way to avoid the redundancy without removing the entries table?

Keep in mind that my model includes more tables in order to support more complex question types. For now I can't provide the full schema.

Solution

You have the feeling that either ENTRIES.survey_id or QUESTIONS.survey_id is redundant (probably because you see that one could be deduced from the other through the ANSWERS entity).

In fact, QUESTIONS and ENTRIES each have an identifying relationship with SURVEYS (an entry or a question cannot exist without a corresponding survey). Formally speaking, their primary key must include a foreign key reference to their parent table SURVEYS:

CREATE TABLE SURVEYS (
    survey_id INT NOT NULL,
    PRIMARY KEY (survey_id)
);

CREATE TABLE QUESTIONS (
    question_id INT NOT NULL,
    survey_id INT NOT NULL,
    PRIMARY KEY (question_id, survey_id),
    FOREIGN KEY (survey_id) REFERENCES SURVEYS(survey_id)
);

CREATE TABLE ENTRIES (
    entry_id INT NOT NULL,
    survey_id INT NOT NULL,
    PRIMARY KEY (entry_id, survey_id),
    FOREIGN KEY (survey_id) REFERENCES SURVEYS(survey_id)
);
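To see the identifying relationship at work, here is a minimal sketch with made-up sample ids (5, 50 and 99 are hypothetical). It assumes an engine that actually enforces foreign keys; SQLite, for instance, only does so after `PRAGMA foreign_keys = ON`, and MySQL only with InnoDB tables.

```sql
-- Hypothetical sample data: survey 5 exists, survey 99 does not.
INSERT INTO SURVEYS (survey_id) VALUES (5);

-- Accepted: the parent survey exists.
INSERT INTO QUESTIONS (question_id, survey_id) VALUES (50, 5);

-- Expected to fail with a foreign key violation:
-- a question cannot exist without its survey.
INSERT INTO QUESTIONS (question_id, survey_id) VALUES (51, 99);
```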

As an interesting consequence, your ANSWERS table should* actually also include survey_id in its foreign key references, because this field is part of the primary key of both QUESTIONS and ENTRIES:

CREATE TABLE ANSWERS (
    entry_id INT NOT NULL,
    survey_id INT NOT NULL,
    question_id INT NOT NULL,
    PRIMARY KEY (entry_id, survey_id, question_id),
    FOREIGN KEY (entry_id, survey_id)
        REFERENCES ENTRIES(entry_id, survey_id),
    FOREIGN KEY (question_id, survey_id)
        REFERENCES QUESTIONS(question_id, survey_id)
);
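As a rough illustration of what the two composite foreign keys buy you, consider the sketch below. The ids are invented for the example, and it again assumes foreign key enforcement is switched on in your engine.

```sql
-- Hypothetical sample data: two surveys with one question each,
-- and one entry that belongs to survey 1.
INSERT INTO SURVEYS (survey_id) VALUES (1), (2);
INSERT INTO QUESTIONS (question_id, survey_id) VALUES (10, 1), (20, 2);
INSERT INTO ENTRIES (entry_id, survey_id) VALUES (100, 1);

-- Accepted: entry 100 and question 10 both belong to survey 1.
INSERT INTO ANSWERS (entry_id, survey_id, question_id) VALUES (100, 1, 10);

-- Expected to fail with a foreign key violation: question 20 belongs to
-- survey 2, so there is no QUESTIONS row (20, 1) to reference.
INSERT INTO ANSWERS (entry_id, survey_id, question_id) VALUES (100, 1, 20);
```

Because ANSWERS carries a single survey_id that takes part in both foreign keys, an answer cannot pair an entry from one survey with a question from another.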

If it helps you let go of the false impression that one survey_id field is redundant, consider that there may exist an entry with no answers (e.g. when a new entry has just been created). In this situation, ENTRIES.survey_id is the only place where the entry's survey is recorded, so it is obviously required.
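The "entry with no answer" case can be sketched like this (ids 3 and 300 are hypothetical sample values):

```sql
-- Hypothetical sample data: a freshly created entry with no answers yet.
INSERT INTO SURVEYS (survey_id) VALUES (3);
INSERT INTO ENTRIES (entry_id, survey_id) VALUES (300, 3);

-- List entries that have no answers, together with their survey.
-- The survey comes straight from ENTRIES, since there is no ANSWERS row
-- to derive it from.
SELECT e.entry_id, e.survey_id
FROM ENTRIES e
LEFT JOIN ANSWERS a
       ON a.entry_id = e.entry_id
      AND a.survey_id = e.survey_id
WHERE a.entry_id IS NULL;
```

The result includes the row (300, 3). Without ENTRIES.survey_id there would be no way to tell which survey that entry was started for.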


* Actually this extra field is required to model the constraint "an answer must relate to a question and an entry which both belong to the same survey".

Licensed under: CC-BY-SA with attribution